Abstract

In a typical range emptiness searching (resp., reporting) problem, we are given a set $P$ of
$n$ points in $\mathbb{R}^d$, and wish to preprocess it into a data structure that supports efficient range
emptiness (resp., reporting) queries, in which we specify a range $\sigma$, which, in general, is a
semi-algebraic set in $\mathbb{R}^d$ of constant description complexity, and wish to determine whether
$P \cap \sigma = \emptyset$, or to report all the points in $P \cap \sigma$. Range emptiness searching and reporting arise in many
applications, and have been treated by Matousek [33] in the special case where the ranges are
halfspaces bounded by hyperplanes. As shown in [33], the two problems are closely related,
and have solutions (for the case of halfspaces) with similar performance bounds. In this paper
we extend the analysis to general semi-algebraic ranges, and show how to adapt Matousek's
technique, without the need to linearize the ranges into a higher-dimensional space. This yields
more efficient solutions to several useful problems, and we demonstrate the new technique in
four applications, with the following results:

(i) An algorithm for ray shooting amid balls in $\mathbb{R}^3$, which uses $O(n)$ storage and $O^*(n)$
preprocessing, and answers a query in $O^*(n^{2/3})$ time, improving the previous bound of $O^*(n^{3/4})$.

(ii) An algorithm that preprocesses, in $O^*(n)$ time, a set $P$ of $n$ points in $\mathbb{R}^3$ into a data
structure with $O(n)$ storage, so that, for any query line $\ell$ (or, for that matter, any simply-shaped
convex set), the point of $P$ farthest from $\ell$ can be computed in $O^*(n^{1/2})$ time. This in turn
yields an algorithm that computes the largest-area triangle spanned by $P$ in time $O^*(n^{26/11})$,
as well as nontrivial algorithms for computing the largest-perimeter or largest-height triangle
spanned by $P$.

(iii) An algorithm that preprocesses, in $O^*(n)$ time, a set $P$ of $n$ points in $\mathbb{R}^2$ into a data
structure with $O(n)$ storage, so that, for any query $\alpha$-fat triangle $\Delta$, we can determine, in $O^*(1)$
time, whether $\Delta \cap P$ is empty. Alternatively, we can report in $O^*(1) + O(k)$ time, the points of
$\Delta \cap P$, where $k = |\Delta \cap P|$.

(iv) An algorithm that preprocesses, in $O^*(n)$ time, a set $P$ of $n$ points in $\mathbb{R}^2$ into a data
structure with $O(n)$ storage, so that, given any query semidisk $c$, or a circular cap larger than
a semidisk, we can determine, in $O^*(1)$ time, whether $c \cap P$ is empty, or report the $k$ points in
$c \cap P$ in $O^*(1) + O(k)$ time.
08-30272, by a grant from the U.S.-Israel Binational Science Foundation, and by the Hermann Minkowski-MINERVA
Center for Geometry at Tel Aviv University. This work is part of the second author's Ph.D. dissertation, prepared
under the supervision of the first author at Tel Aviv University.

Adapting the recent techniques of [12], [13], HH, we can turn our solutions into efficient
algorithms for approximate range counting (with small relative error) for the cases mentioned
above.

Our technique is closely related to the notions of nearest- or farthest-neighbor generalized 
Voronoi diagrams, and of the union or intersection of geometric objects, where sharper bounds 
on the combinatorial complexity of these structures yield faster range emptiness searching or 
reporting algorithms. 
1 Introduction



The main technical contribution of this paper is an extension of Matousek's range emptiness and 
reporting data structures [33] (see also [7] for a dynamic version of the problem) to the case of 
general semi-algebraic ranges. 

Ray shooting amid balls. A motivating application of this study is ray shooting amid balls in
$\mathbb{R}^3$, where we want to construct a data structure of linear size with near-linear preprocessing, which
supports ray shooting queries in sublinear time. Typically, in problems of this sort, the bound on
the query time is some fractional power of $n$, the number of objects, and the goal is to make the
exponent as small as possible. For example, ray shooting amid a collection of $n$ arbitrary triangles
can be performed in $O^*(n^{3/4})$ time (with linear storage) [6]. Better solutions are known for various
special cases. For example, the authors have shown [41] that the query time can be improved to
$O^*(n^{2/3})$, when the triangles are all fat, or are all stabbed by a common line.

At the other end of the spectrum, one is interested in ray shooting algorithms and data structures
where a ray shooting query can be performed in logarithmic or polylogarithmic time (or even $O(n^\varepsilon)$
time, for any $\varepsilon > 0$; this is $O^*(1)$ in our shorthand notation). In this case, the goal is to reduce
the storage (and preprocessing) requirements as much as possible. For example, for arbitrary
triangles (and even for the special case of fat triangles), the best known bound for the storage
requirement (with logarithmic query time) is $O^*(n^4)$ [1], [6]. For balls, Mohaban and Sharir [37]
gave an algorithm with $O^*(n^3)$ storage and $O^*(1)$ query time. However, when only linear storage
is used, the previously best known query time (for balls) is $O^*(n^{3/4})$ (as in the case of general
triangles). In this paper we show, as an application of our general range emptiness machinery, that
this can be improved to $O^*(n^{2/3})$ time.

When answering a ray-shooting query for a set $S$ of input objects, one generally reduces the
problem to that of answering segment emptiness queries, following the parametric searching scheme
proposed by Agarwal and Matousek [5] (see also Megiddo [36] for the original underlying technique).

A standard way of performing the latter kind of queries is to switch to a dual parametric space,
where each object in the input set is represented by a point. A segment $e$ is mapped to a surface
$\sigma_e$, which is the locus of all the points representing the objects that $e$ touches (without penetrating
into their interior). Usually, $\sigma_e$ partitions the dual space into two portions, one, $\sigma_e^+$, consisting
of points representing objects whose interior is intersected by $e$, and the other, $\sigma_e^-$, consisting of
points representing objects that $e$ avoids. The segment-emptiness problem thus transforms into a
range-emptiness query: Does $\sigma_e^+$ contain any point representing an input object?

Range reporting and emptiness searching. Range-emptiness queries of this kind have been
studied by Matousek [33] (see also Agarwal and Matousek [7]), but only for the case where the
ranges are halfspaces bounded by hyperplanes. For this case, Matousek has established a so-called
shallow-cutting lemma, that shows the existence of a $(1/s)$-cutting¹ that covers the complement of
the union of any $m$ given halfspace ranges, whose size is significantly smaller than the size of a
$(1/s)$-cutting that covers the entire space. This lemma provides the basic tool for partitioning a
point set $P$, in the style of [M], so that shallow hyperplanes (those containing at most $n/r$ points of
$P$ below them, say, for some given parameter $r$) cross only a small number of cells of the partition
(see below for more details). This in turn yields a data structure, known as a shallow partition
tree, that stores a recursive partitioning of $P$, which enables us to answer more efficiently halfspace
range reporting queries for shallow hyperplanes, and thus also halfspace range emptiness queries.
Using this approach, the query time (for emptiness) improves from the general halfspace range
searching query cost of $O^*(n^{1-1/d})$ to $O^*(n^{1-1/\lfloor d/2 \rfloor})$. Reporting takes $O^*(n^{1-1/\lfloor d/2 \rfloor} + k)$, where $k$
is the output size.

¹This is a partition of space (or a portion thereof) into a small number of simply-shaped cells, each of which is
crossed by at most $n/s$ of the $n$ given surfaces (hyperplanes in this case). See below for more details.
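
To make the gain concrete, the following short Python sketch tabulates the two query exponents quoted above, $1 - 1/d$ for general halfspace range searching and $1 - 1/\lfloor d/2 \rfloor$ for emptiness via shallow partition trees. The function names are ours and purely illustrative; the sketch only evaluates the stated formulas and ignores the subpolynomial factors hidden in the $O^*(\cdot)$ notation.

```python
from fractions import Fraction


def general_exponent(d: int) -> Fraction:
    """Exponent 1 - 1/d of the general halfspace range searching bound."""
    return 1 - Fraction(1, d)


def shallow_exponent(d: int) -> Fraction:
    """Exponent 1 - 1/floor(d/2) of the emptiness bound for halfspaces."""
    return 1 - Fraction(1, d // 2)


# Tabulate both exponents for small dimensions d >= 2.
for d in range(2, 7):
    print(d, general_exponent(d), shallow_exponent(d))
```

For $d = 2, 3$ the second exponent is $0$, which corresponds to an $O^*(1)$ emptiness query, and for $d = 4$ the exponent drops from $3/4$ to $1/2$.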

Consequently, one way of applying this machinery for more general semi-algebraic ranges is to
"lift" the set of points and the ranges into a higher-dimensional space by means of an appropriate
linearization, as in [6], and then apply the above machinery. (For this, one needs to assume
that the given ranges have constant description complexity, meaning that each range is a Boolean
combination of a constant number of polynomial equalities and inequalities of constant maximum
degree.) However, if the space in which the ranges are linearized has high dimension, the resulting
range reporting or emptiness queries become significantly less efficient. Moreover, in many
applications, the ranges are Boolean combinations of polynomial (equalities and) inequalities, which
creates additional difficulties in linearizing the ranges, resulting in even worse running time.

An alternative technique is to give up linearization, and instead work in the original space. As
follows from the machinery of [33] (and further elaborated later in this paper), this requires, as a
major tool, the (existence and) construction of a decomposition of the complement of the union
of $m$ given ranges (in the case of segment emptiness, these are the ranges $\sigma_e^+$, for an appropriate
collection of segments $e$), into a small number of "elementary cells" (in the terminology of [6];
see also below). Here we face, especially in higher dimensions, a scarcity of sharp bounds on the
complexity of the union itself, to begin with, and then on the complexity of a decomposition of
its complement. Often, the best one can do is to decompose the entire arrangement of the given
ranges, which results in too many elementary cells, and consequently in an algorithm with poor
performance.

To recap, in the key technical step in answering general semi-algebraic range reporting or empti- 
ness queries, the best current approaches are either to construct a cutting of the entire arrangement 
of the range-bounding surfaces in the original space, or to construct a shallow cutting in another 
higher-dimensional space into which the ranges can be linearized. For many natural problems 
(including the segment-emptiness problem), both approaches yield relatively poor performance. 

As we will shortly note, in handling general semi-algebraic ranges, we face another major 
technical issue, having to do with the construction of efficient test sets of ranges (in the terminology 
of [6], elaborated below). Addressing this issue is a major component of the analysis in this paper,
and is discussed in detail later on. 

Our results. We propose a variant of the shallow-cutting machinery of [33] for the case of
semi-algebraic ranges, which avoids the need for linearization, and works in the original space (which,
for the case of ray shooting amid balls, is a 4-dimensional parametric space in which the balls are 
represented as points). While the machinery used by our variant is similar in principle to that in 
[33], there are several significant technical difficulties which require more careful treatment. 

Matousek's technique [33], as well as ours, considers a finite set $Q$ of shallow ranges (called a
test set), and builds a data structure which caters only for ranges in $Q$. Matousek shows how to
build, for any given parameter $r$, a set of halfspaces of size polynomial in $r$, which represents well all
$(n/r)$-shallow ranges, in the following sense: For any simplicial partition² $\Pi$ with parameter $r$, let
$\kappa$ denote the maximal number of cells of $\Pi$ crossed by a halfspace in $Q$. Then each $(n/r)$-shallow
halfspace crosses at most $c\kappa$ cells of $\Pi$, where $c$ is a constant that depends on the dimension.
Unfortunately (for the present analysis), the linear nature of the ranges is crucially needed for the
proof, which therefore fails for non-linear ranges.

²Briefly, this is a partition of $P$ into $O(r)$ subsets of roughly equal size, each enclosed by some simplex (in the

Being a good representative of all shallow ranges, in the above sense, is only one of the require- 
ments from a good test set Q. The other requirements are that Q be small, so that, in particular, 
it can be constructed efficiently, and that the (decomposition of the) complement of the union of 
any subset of Q have small complexity. All these properties hold for the case of halfspaces bounded 
by hyperplanes, studied in [33]. 

As it turns out, and hinted above, obtaining a "good" test set Q for general semi-algebraic 
ranges, with the above properties, is not an easy task. We give a simple general recipe for con- 
structing such a set Q, but it consists of more complex ranges than those in the original setup. A 
major problem with this recipe is that since the members of Q have a more complex shape, it be- 
comes harder to establish good bounds on the complexity of (the decomposition of) the complement 
of the union of any subset of these generalized ranges. 

Nevertheless, once a good test set has been shown to exist, and to be efficiently computable, 
it leads to a construction of an efficient elementary-cell partition with a small crossing number for 
any empty or shallow original range. Using this construction recursively, one obtains a partition 
tree, of linear size, so that any shallow original range $\gamma$ visits only a small number of its nodes
(where $\gamma$ visits a node if it crosses the elementary cell enclosing the subset of that node, meaning
that it intersects this cell but does not fully contain it), which in turn leads to an efficient range
reporting or emptiness-testing procedure. This part, of constructing and searching the tree, is
almost identical to its counterparts in the earlier works [U [33], [34], and we will not elaborate on it
here, focusing only on the technicalities in the construction of a single "shallow" elementary-cell 
partition. 

Developing all this machinery, and then putting it into action, we obtain efficient data structures 
for the following applications, improving previous results or obtaining the first nontrivial solutions. 
These instances are: 

Ray shooting amid balls in 3-space. Given a set $S$ of $n$ balls in $\mathbb{R}^3$, we construct, in $O^*(n)$
time, a data structure of $O(n)$ size, which can determine, for a given query segment $e$, whether
$e$ is empty (avoids all balls), in $O^*(n^{2/3})$ time. Plugging this data structure into the parametric
searching technique of Agarwal and Matousek [5], we obtain a data structure for answering ray
shooting queries amid the balls of $S$, which has similar performance bounds.

We represent balls in 3-space as points in $\mathbb{R}^4$, where a ball with center $(a, b, c)$ and radius $r$ is
mapped to the point $(a, b, c, r)$, and each object³ $K$ is mapped to the surface $\sigma_K$, which is the
locus of all (points representing) balls tangent to $K$ (i.e., balls that touch $K$, but do not penetrate
into its interior). In this case, the range of an object $K$ is the upper halfspace $\sigma_K^+$ consisting of
all points lying above $\sigma_K$ (representing balls that intersect $K$). The complement of the union of a
subfamily of these ranges is the region below the lower envelope of the corresponding surfaces $\sigma_K$.
The minimization diagram of this envelope is the 3-dimensional Euclidean Voronoi diagram of the
corresponding set of objects. Thus we reveal (what we regard as) a somewhat surprising connection
between the problem of ray shooting amid balls and the problem of analyzing the complexity of
Euclidean Voronoi diagrams of (simply-shaped) objects in 3-space.

linear case) or some elementary cell (in the general semi-algebraic case); see and Section [3] below.

³In our solution, we will use a test set of objects $K$ which are considerably more complex than just lines or
segments, but are nevertheless still of constant description complexity.
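
As a point of reference for the representation just described, here is a minimal Python sketch of the segment-emptiness predicate itself, answered by a naive linear scan rather than by the data structure of this paper. Each ball is stored as the 4-tuple $(a, b, c, r)$, and a segment counts as empty when it meets the interior of no ball (touching is allowed). The function names are hypothetical and chosen only for illustration.

```python
import math
from typing import Iterable, Tuple

Ball = Tuple[float, float, float, float]  # (a, b, c, r): a point in 4-space
Point3 = Tuple[float, float, float]


def dist_point_segment(p: Point3, s: Point3, t: Point3) -> float:
    """Euclidean distance from p to the closed segment st."""
    d = tuple(t[i] - s[i] for i in range(3))
    w = tuple(p[i] - s[i] for i in range(3))
    dd = sum(x * x for x in d)
    # Parameter of the closest point on the segment, clamped to [0, 1].
    if dd == 0:
        u = 0.0
    else:
        u = max(0.0, min(1.0, sum(w[i] * d[i] for i in range(3)) / dd))
    closest = tuple(s[i] + u * d[i] for i in range(3))
    return math.dist(p, closest)


def segment_is_empty(balls: Iterable[Ball], s: Point3, t: Point3) -> bool:
    """True if the segment st meets the interior of no ball (O(n) scan)."""
    return all(dist_point_segment((a, b, c), s, t) >= r for (a, b, c, r) in balls)


balls = [(0.0, 0.0, 0.0, 1.0), (5.0, 0.0, 0.0, 1.0)]
print(segment_is_empty(balls, (-3.0, 2.0, 0.0), (3.0, 2.0, 0.0)))
print(segment_is_empty(balls, (-3.0, 0.5, 0.0), (3.0, 0.5, 0.0)))
```

The scan takes linear time per query; the point of the machinery developed here is to answer the same predicate in sublinear time with linear storage.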

Farthest point from a line (or from any convex set) in $\mathbb{R}^3$. Let $P$ be a set of $n$ points
in $\mathbb{R}^3$. We wish to preprocess $P$ into a data structure of size $O(n)$, so that, for any query line $\ell$,
we can efficiently find the point of $P$ farthest from $\ell$. This is a useful routine for approximating
polygonal paths in three dimensions; see [21].

As in the ray shooting problem, we can reduce such a query to a range emptiness query of 
the form: Given a cylinder C, does it contain all the points of P? (That is, is the complement of 
the cylinder empty?) We prefer to regard this as an instance of the complementary range fullness 
problem, which seeks to determine whether a query range is full (i.e., contains all the input points). 
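
The reduction can be illustrated by a brute-force Python sketch: the distance from the farthest point to the query line is the smallest radius for which the cylinder around the line is full. The code below is a naive linear scan, not the data structure of this paper, and the names `dist_point_line`, `cylinder_is_full` and `farthest_point` are our own.

```python
import math
from typing import Sequence, Tuple

Point3 = Tuple[float, float, float]


def dist_point_line(p: Point3, a: Point3, u: Point3) -> float:
    """Distance from p to the line through a with (nonzero) direction u."""
    w = tuple(p[i] - a[i] for i in range(3))
    cross = (
        w[1] * u[2] - w[2] * u[1],
        w[2] * u[0] - w[0] * u[2],
        w[0] * u[1] - w[1] * u[0],
    )
    return math.hypot(*cross) / math.hypot(*u)


def cylinder_is_full(points: Sequence[Point3], a: Point3, u: Point3, radius: float) -> bool:
    """Range fullness: does the cylinder of this radius around the line contain all points?"""
    return all(dist_point_line(p, a, u) <= radius for p in points)


def farthest_point(points: Sequence[Point3], a: Point3, u: Point3) -> Point3:
    """The point farthest from the line, found by a linear scan."""
    return max(points, key=lambda p: dist_point_line(p, a, u))


pts = [(0.0, 1.0, 0.0), (0.0, 0.0, 3.0), (5.0, 0.0, 0.0)]
axis_point, axis_dir = (0.0, 0.0, 0.0), (1.0, 0.0, 0.0)
print(farthest_point(pts, axis_point, axis_dir))
```

With a fast fullness test in place of `cylinder_is_full`, the radius can be searched for (by parametric searching in the paper) instead of scanning all points.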

Our machinery can handle this problem. In fact, we can solve the range fullness problem for 
any family of convex ranges in 3-space, of constant description complexity. Our solution requires 
$O(n)$ storage and near linear preprocessing, and answers a range fullness query in $O^*(n^{1/2})$ time,
improving the query time $O^*(n^{2/3})$ given by Agarwal and Matousek [6].

We then apply this result to solve the problem of finding the largest-area triangle spanned by 
a set of $n$ points in 3-space. The resulting algorithm requires $O^*(n^{26/11})$ time, which improves a
previous bound of $O^*(n^{13/5})$ due to Daescu and Serfling [21]. We also adapt our machinery to
compute efficiently the largest-perimeter triangle and the largest-height triangle spanned by such 
a point set. 

In both this, and the preceding ray-shooting applications, we use the general, more abstract 
recipe for constructing good test sets. 

Fat triangle and circular cap range emptiness searching and reporting. Finally, we
consider two planar instances of the range emptiness and reporting problems, in which we are given
a planar set $P$ of $n$ points, and the ranges are either $\alpha$-fat triangles or sufficiently large circular
caps (say, larger than a semidisk). The general technique of Agarwal and Matousek [6] yields, for
any class of planar ranges with constant description complexity, a data structure with near linear
preprocessing and linear storage, which answers such queries in time $O^*(n^{1/2})$ (for emptiness) or
$O^*(n^{1/2}) + O(k)$ (for reporting). We improve the query time to $O^*(1)$ and $O^*(1) + O(k)$, respectively,
in both cases.

In these planar applications, we abandon the general recipe, and construct good test sets in
an ad-hoc (and simpler) manner. For $\alpha$-fat triangles (i.e., triangles with the property that each of
their angles is at least $\alpha$, which is some fixed positive constant), the test set consists of "canonical"
$(\alpha/2)$-fat triangles, and the fast query performance is a consequence of the fact that the complexity
of the complement of the union of $m$ $\alpha'$-fat triangles is $O(m \log\log m)$, for any constant $\alpha' > 0$ [35].
It is quite likely that our machinery can also be applied to other classes of fat objects in the plane,
for which near-linear bounds on the complexity of their union are known [22], |2^, [25], [26]. However,
constructing a good test set for each of these classes is not an obvious step. We leave these extensions
as open problems for further research.

For circular caps, the motivation for range emptiness searching comes from the problem of
finding, for a query consisting of a point $q$ and a line $\ell$, the point of $P$ which lies above $\ell$ and
is nearest to $q$ (we only consider the case where $q$ lies on or above $\ell$). Such a procedure was
considered in [20]. Using parametric searching, the latter problem can be reduced to that of testing
for emptiness of a circular cap centered at $q$ and bounded by $\ell$ (the assumption on the location of
$q$ ensures that this cap is at least a semidisk). Here too we manage to construct a test set which
consists of (possibly slightly smaller) circular caps, and we exploit the fact that the complexity of
the union of $m$ such caps is $O^*(m)$, as long as the caps are not too small (relative to their bounding
circles), to obtain the fast performance stated above.

Approximate range counting. Adapting the recent techniques of [12], [13], [T^, we can turn our
solutions into efficient algorithms for approximate range counting (with small relative error) for
the cases mentioned above. That is, for a specified $\varepsilon > 0$, we can preprocess the input point set $P$
into a data structure which can efficiently compute, for any query range $\gamma$, an approximate count
$t_\gamma$, satisfying $(1 - \varepsilon)|P \cap \gamma| \le t_\gamma \le (1 + \varepsilon)|P \cap \gamma|$. The performance of the resulting algorithms is
detailed in Section [3. As observed in the papers just cited, approximate range counting is closely
related to the range emptiness problem, which in fact is a special case of the former problem. The
algorithm in [12] performs approximate range counting by a randomized binary search over $|P \cap \gamma|$,
where the search is guided by repeated calls to an emptiness testing routine on various random
samples of $P$. This algorithm uses emptiness searching as a black box, so, plugging our solutions for
this latter problem into their algorithm, we obtain efficient approximate range counting algorithms
for the ranges considered in this paper. See Section 7 for details.

Related work. Our study was originally motivated by work by Daescu and others [20], [21] on
path approximations and related problems. In these applications one needs to compute efficiently 
the vertex of a subpath which is farthest from a given segment (connecting the two endpoints of 
the subpath). These works used the standard range searching machinery of [^, and motivated us 
to look for faster implementations. 

The general range emptiness (or reporting) problem was studied by the authors a few years 
ago [l2]. In this earlier version, we did not manage to handle properly the issue of constructing a 
good test set, so the results presented there are somewhat incomplete. The present paper builds 
upon the previous one, but provides a thorough analysis of this aspect of the problem, and conse- 
quently obtains a complete and efficient solution to the problems listed above, and lays down the 
foundation for obtaining efficient solutions to many other similar problems — we believe indeed that 
the applications given here only scratch the surface of the wealth of potential future applications 
of this sort. 

2 Preliminaries and notations 

We begin with a brief review of the main concepts and notations used in our analysis. 

Range spaces. A range space is a pair $(X, \Gamma)$, where $X$ is a set and $\Gamma \subseteq 2^X$ is a collection of
subsets of $X$, called ranges. In our applications, $X = \mathbb{R}^d$, and $\Gamma$ is a collection of semi-algebraic
sets of some specific type, each having constant description complexity. That is, each set in $\Gamma$ is
given as a Boolean combination of a constant number of polynomial equalities and inequalities of
constant maximum degree. To simplify the analysis, we assume⁴, as in [6], that all the ranges in
$\Gamma$ are defined by a single Boolean combination, so that each polynomial $p$ in this combination is
$(d + t)$-variate, and each range $\gamma$ has $t$ degrees of freedom, so that if we substitute the values of
these $t$ parameters into the last $t$ variables of each $p$, the resulting Boolean combination defines
the range $\gamma$. This allows us to represent the ranges of $\Gamma$ as points in an appropriate $t$-dimensional
parametric space.

⁴This assumption is not essential, and is only made to simplify the presentation.

Under these special assumptions, the range space $(X, \Gamma)$ has finite VC-dimension, a property
formally defined in [27]. Informally, it ensures that, for any finite subset $P$ of $X$, the number of
distinct ranges of $P$ is $O(|P|^\delta)$, where $\delta$ is the VC-dimension.

As a matter of fact, we will consider range spaces of the form $(P, \Gamma_P)$, where $P \subset \mathbb{R}^d$ is a finite
point set, and each range in $\Gamma_P$ is the intersection of $P$ with a range in $\Gamma$.

Cuttings. Given a finite collection $\Gamma$ of $n$ semi-algebraic ranges in $\mathbb{R}^d$, as above, and a parameter
$r \le n$, a $(1/r)$-cutting for $\Gamma$ is a partition $\Xi$ of $\mathbb{R}^d$ (or of some portion of $\mathbb{R}^d$) into a finite number of
relatively open cells of dimensions $0, 1, \ldots, d$, so that each cell is crossed by at most $n/r$ ranges of
$\Gamma$, where a range $\gamma \in \Gamma$ is said to cross a cell $\tau$ if $\gamma \cap \tau \ne \emptyset$, but $\gamma$ does not fully contain $\tau$. We will
also need to consider weighted $(1/r)$-cuttings, where each range $\gamma \in \Gamma$ has a positive weight $w(\gamma)$,
and each cell of $\Xi$ is crossed by ranges whose total weight is at most $W/r$, where $W = \sum_{\gamma \in \Gamma} w(\gamma)$
is the overall weight of all the ranges in $\Gamma$.
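
The crossing condition is easy to state in code for a simple planar instance. The sketch below is only an illustration under assumptions of our own: ranges are closed upper halfplanes $y \ge ax + b$, encoded as pairs `(a, b)`, and cells are open, non-degenerate triangles given by their three vertices. In that setting a halfplane crosses a triangle exactly when the bounding line has vertices strictly on both of its sides.

```python
from typing import Sequence, Tuple

Halfplane = Tuple[float, float]  # (a, b) encodes the range {(x, y) : y >= a*x + b}
Point2 = Tuple[float, float]
Triangle = Tuple[Point2, Point2, Point2]  # open, non-degenerate cell (assumed)


def side(h: Halfplane, p: Point2) -> float:
    """Signed vertical offset of p from the bounding line of h."""
    a, b = h
    return p[1] - (a * p[0] + b)


def crosses(h: Halfplane, cell: Triangle) -> bool:
    """h meets the open triangle but does not contain it."""
    vals = [side(h, v) for v in cell]
    return max(vals) > 0 and min(vals) < 0


def is_cutting(ranges: Sequence[Halfplane], cells: Sequence[Triangle], r: float) -> bool:
    """Check the (1/r)-cutting condition: every cell is crossed by at most n/r ranges."""
    n = len(ranges)
    return all(sum(crosses(h, c) for h in ranges) <= n / r for c in cells)


cell = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
ranges = [(0.0, 0.5), (0.0, 5.0), (0.0, -1.0), (1.0, 0.2)]
print([crosses(h, cell) for h in ranges])
```

The check covers only the crossing bound; whether the cells actually partition the relevant portion of the plane has to be verified separately.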

Shallow ranges. A range $\gamma \in \Gamma$ is called $k$-shallow with respect to a set $P$ of points in $\mathbb{R}^d$, if
$|\gamma \cap P| \le k$.

Elementary cells. Define, as in [6], an elementary cell in $\mathbb{R}^d$ to be a connected relatively open
semi-algebraic set of some dimension $k \le d$, which is homeomorphic to a ball and has constant
description complexity. As above, we assume, for simplicity, that the elementary cells are defined
by a single Boolean combination involving $t$ free variables, and each cell is determined by fixing the
values of these $t$ parameters.

Elementary cell partition. Let $P$ be a set of $n$ points in $\mathbb{R}^d$. An elementary cell partition of
$P$ is a collection $\Pi = \{(P_1, s_1), \ldots, (P_m, s_m)\}$, for some integer $m$, such that (i) $\{P_1, \ldots, P_m\}$ is a
partition of $P$ (into pairwise disjoint subsets), and (ii) each $s_i$ is an elementary cell that contains
the respective subset $P_i$. In general, the cells $s_i$ need not be disjoint. Usually, one also specifies a
parameter $r \le n$, and requires that $n/r \le |P_i| \le 2n/r$ for each $i$, so $m = O(r)$.
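
The definition can be checked mechanically. The following Python sketch is a simplified planar illustration, assuming (our choice, not the paper's) that every elementary cell is an open axis-parallel box `(x0, y0, x1, y1)` and that the input points are distinct. It verifies conditions (i) and (ii) together with the size requirement $n/r \le |P_i| \le 2n/r$.

```python
from typing import List, Sequence, Tuple

Point2 = Tuple[float, float]
Box = Tuple[float, float, float, float]  # open box (x0, y0, x1, y1), assumed cell shape


def in_box(p: Point2, box: Box) -> bool:
    """Is p inside the open axis-parallel box?"""
    x0, y0, x1, y1 = box
    return x0 < p[0] < x1 and y0 < p[1] < y1


def is_elementary_cell_partition(
    points: Sequence[Point2], parts: List[Tuple[List[Point2], Box]], r: float
) -> bool:
    """Check conditions (i), (ii) and the size bounds for distinct input points."""
    n = len(points)
    covered = [p for subset, _ in parts for p in subset]
    # (i) the subsets are pairwise disjoint and their union is P.
    if sorted(covered) != sorted(points):
        return False
    for subset, box in parts:
        # (ii) each cell contains its subset; cells themselves may overlap.
        if not all(in_box(p, box) for p in subset):
            return False
        if not (n / r <= len(subset) <= 2 * n / r):
            return False
    return True


P = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0), (5.0, 5.0)]
Pi = [(P[:2], (-1.0, -1.0, 2.0, 2.0)), (P[2:], (3.0, 3.0, 6.0, 6.0))]
print(is_elementary_cell_partition(P, Pi, r=2))
```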

The function $\zeta(r)$. In Lemma 3.1 and Theorem 3.2 we use a function $\zeta(r)$ that bounds the
number of elementary cells in a decomposition of the complement of the union of any $r$ ranges of
$\Gamma$. We assume that $\zeta(r)$ is "well behaved", in the sense that for each $c > 0$ there exists $c' > 0$ such
that $\zeta(cr) \le c'\zeta(r)$ for every $r$. We also assume that $\zeta(r) = \Omega(r)$.

$(\nu, \alpha)$-samples and shallow $\varepsilon$-nets. We recall the result of Li et al. [32], and adapt it, similar
to the recent observations in [28], to obtain a useful extension of the notion of $\varepsilon$-nets.

Let $(X, \mathcal{R})$ be a range space of finite VC-dimension $\delta$, and let $0 < \alpha, \nu < 1$ be two given
parameters. Consider the distance function

$$d_\nu(r, s) = \frac{|r - s|}{r + s + \nu}, \quad \text{for } r, s \ge 0.$$

A subset $N \subseteq X$ is called a $(\nu, \alpha)$-sample if for each $R \in \mathcal{R}$ we have

$$d_\nu\left(\frac{|X \cap R|}{|X|}, \frac{|N \cap R|}{|N|}\right) < \alpha.$$


Theorem 2.1 (Li et al. [32]) A random sample $N$ of

$$O\left(\frac{1}{\alpha^2 \nu}\left(\delta \log\frac{1}{\nu} + \log\frac{1}{q}\right)\right)$$

elements of $X$ is a $(\nu, \alpha)$-sample with probability at least $1 - q$.
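
For a small finite range space the sample condition can be verified directly. The Python sketch below evaluates the distance $d_\nu$ and tests a candidate subset against an explicit list of ranges; the ground set, the ranges (prefixes of $\{0, \ldots, 99\}$) and the two candidate samples are toy choices of ours, meant only to show the definition at work.

```python
from typing import Iterable, Set


def d_nu(r: float, s: float, nu: float) -> float:
    """The distance |r - s| / (r + s + nu) between two relative sizes."""
    return abs(r - s) / (r + s + nu)


def is_sample(X: Set[int], N: Set[int], ranges: Iterable[Set[int]], nu: float, alpha: float) -> bool:
    """Check d_nu(|X & R|/|X|, |N & R|/|N|) < alpha for every listed range R."""
    for R in ranges:
        x_frac = len(X & R) / len(X)
        n_frac = len(N & R) / len(N)
        if d_nu(x_frac, n_frac, nu) >= alpha:
            return False
    return True


X = set(range(100))
prefixes = [set(range(j)) for j in range(10, 101, 10)]
evenly_spread = set(range(5, 100, 10))  # one element out of every ten
clustered = set(range(10))              # all ten elements in the first prefix
print(is_sample(X, evenly_spread, prefixes, nu=0.1, alpha=0.5))
print(is_sample(X, clustered, prefixes, nu=0.1, alpha=0.5))
```

The evenly spread subset reproduces every prefix fraction exactly, while the clustered one overestimates the first prefix (relative size 1 against 0.1) and fails.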

Har-Peled and Sharir [28] show that, by appropriately choosing $\alpha$ and $\nu$, various standard
constructs, such as $\varepsilon$-nets and $\varepsilon$-approximations, are special cases of $(\nu, \alpha)$-samples. Here we follow
a similar approach, and show the existence of small-size shallow $\varepsilon$-nets, a new notion introduced
in this paper.

Let us first define this notion. Let $(X, \mathcal{R})$ be a range space of finite VC-dimension $\delta$, and let
$0 < \varepsilon < 1$ be a given parameter. A subset $N \subseteq X$ is a shallow $\varepsilon$-net if it satisfies the following two
properties, for some absolute constant $c$.

(i) For each $R \in \mathcal{R}$ and for any parameter $t \ge 0$, if $|N \cap R| \le t \log\frac{1}{\varepsilon}$ then $|X \cap R| \le c(t + 1)\varepsilon|X|$.

(ii) For each $R \in \mathcal{R}$ and for any parameter $t \ge 0$, if $|X \cap R| \le t\varepsilon|X|$ then $|N \cap R| \le c(t + 1) \log\frac{1}{\varepsilon}$.

Note the difference between shallow and standard $\varepsilon$-nets: Property (i) (with $t = 0$) implies that
a shallow $\varepsilon$-net is also a standard $\varepsilon$-net (possibly with a recalibration of $\varepsilon$). Property (ii) has no
parallel in the case of standard $\varepsilon$-nets: there is no guarantee how a standard net interacts with
small ranges.

Theorem 2.2 A random sample $N$ of

$$O\left(\frac{1}{\varepsilon}\left(\delta \log\frac{1}{\varepsilon} + \log\frac{1}{q}\right)\right)$$

elements of $X$ is a shallow $\varepsilon$-net with probability at least $1 - q$.

Proof: Take $\alpha = 1/2$, say, and calibrate the constants in the size of $N$ to guarantee, with probability
$1 - q$, that $N$ is an $(\varepsilon, 1/2)$-sample. Assume that this is indeed the case. For a range $R \in \mathcal{R}$, put
$X_R = |X \cap R|/|X|$ and $N_R = |N \cap R|/|N|$. We have

$$d_\varepsilon(X_R, N_R) = \frac{|X_R - N_R|}{X_R + N_R + \varepsilon} < \frac{1}{2}.$$

That is,

$$|X_R - N_R| < \frac{1}{2}(X_R + N_R + \varepsilon),$$

or

$$X_R < 3N_R + \varepsilon, \quad \text{and, symmetrically,} \quad N_R < 3X_R + \varepsilon.$$

This is easily seen to imply properties (i) and (ii). For (i), let $R$ be a range for which $|N \cap R| \le t \log\frac{1}{\varepsilon}$;
that is, $N_R \le \beta t \varepsilon$, for some absolute constant $\beta$ (proportional to the VC-dimension). Then

$$|X \cap R| = |X| \cdot X_R < |X|(3N_R + \varepsilon) \le (3\beta t + 1)\varepsilon|X|.$$

For (ii), let $R$ be a range for which $|X \cap R| \le t\varepsilon|X|$; that is, $X_R \le t\varepsilon$. Then

$$|N \cap R| = |N| \cdot N_R < |N|(3X_R + \varepsilon) \le (3t + 1)\varepsilon|N| \le (3t + 1)\gamma \log\frac{1}{\varepsilon},$$

for another absolute constant $\gamma$ (again, proportional to the VC-dimension). □
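
For completeness, the passage from the sample condition to the two linear inequalities used in the proof can be spelled out as follows; the symmetric inequality is obtained by exchanging the roles of $X_R$ and $N_R$. This is only an expansion of the step stated above, with no new assumptions.

```latex
\begin{align*}
X_R - N_R \;\le\; |X_R - N_R| &< \tfrac{1}{2}\,(X_R + N_R + \varepsilon) \\
\Longrightarrow\quad \tfrac{1}{2}\,X_R &< \tfrac{3}{2}\,N_R + \tfrac{1}{2}\,\varepsilon \\
\Longrightarrow\quad X_R &< 3N_R + \varepsilon .
\end{align*}
```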

3 Semi-algebraic range reporting or emptiness searching 

Shallow cutting in the semi-algebraic case. We begin by extending the shallow cutting 
lemma of Matousek [33] to the more general setting of semi-algebraic ranges. This extension is 
fairly straightforward, although it involves several technical steps that deserve to be highlighted. 

Lemma 3.1 (Extended Shallow Cutting Lemma) Let $\Gamma$ be a collection of $n$ semi-algebraic
ranges in $\mathbb{R}^d$. Assume that the complement of the union of any subset of $m$ ranges in $\Gamma$ can be
decomposed into at most $\zeta(m)$ elementary cells, for a well-behaved function $\zeta$ as above. Then, for
any $r \le n$, there exists a $(1/r)$-cutting $\Xi$ with the following properties:

(i) The union of the cells of $\Xi$ contains the complement of the union of $\Gamma$.

(ii) $\Xi$ consists of $O(\zeta(r))$ elementary cells.

(iii) The complement of the union of the cells of $\Xi$ is contained in a union of $O(r)$ ranges in $\Gamma$.

See Figure 1 for an illustration.

Proof. The proof is a fairly routine adaptation of the proof in [33]. We employ a variant of the
method of Chazelle and Friedman [18] for constructing the cutting. Let $\Gamma'$ be a random sample of
$O(r)$ ranges of $\Gamma$, and let $E'$ denote the complement of the union of $\Gamma'$. By assumption, $E'$ can be
decomposed into at most $O(\zeta(r))$ elementary cells. The resulting collection $\Xi$ of these cells is such
that their union clearly contains the complement of the union of $\Gamma$. Moreover, the complement of
the union of $\Xi$ is the union of the $O(r)$ ranges of $\Gamma'$. Hence, $\Xi$ satisfies all three conditions (i)-(iii),
but it may fail to be a $(1/r)$-cutting.

This latter property is enforced as in [18], by further decomposing each cell $\tau$ of $\Xi$ that is
crossed by more than $n/r$ ranges of $\Gamma$, using additional subsamples from the surfaces that cross $\tau$.
Specifically, for each cell $\tau$ of $\Xi$, let $\Gamma_\tau$ denote the subset of those ranges in $\Gamma$ that cross $\tau$, and
put $\xi_\tau = |\Gamma_\tau| r/n$. If $\xi_\tau > 1$, we sample $q = O(\xi_\tau \log \xi_\tau)$ ranges from $\Gamma_\tau$, construct the complement
of the union of these ranges, decompose it into at most $\zeta(q)$ elementary cells, and clip them to
within $\tau$. The resulting collection $\Xi'$ of subcells, over all cells $\tau$ of the original $\Xi$, clearly satisfies
(i). The analysis of [18] (see also [8]) establishes an exponential decay property on the number of
cells of $\Xi$ that are crossed by more than $\xi n/r$ ranges, as a function of $\xi$. Specifically, as in [8],
the expected number of such cells is $O(2^{-\xi} E(\zeta(|\Gamma''|)))$, where $\Gamma''$ is another random sample of $\Gamma$,
in which each member of $\Gamma$ is chosen with the same probability. This property implies, as usual [18], that
$\Xi'$ is (with high probability) a $(1/r)$-cutting, and it also implies that the size of $\Xi'$ is still $O(\zeta(r))$,
assuming $\zeta$ to be well behaved. Since we have only refined the original cells of $\Xi$, the number of
ranges that cover the complement of the union of the final cells is still $O(r)$. □

A special case that arises frequently is where each range in Γ is an upper (or lower) halfspace
bounded by the graph of some continuous (d-1)-variate function. In this case the complement K
of the union of r ranges is the portion of space that lies below the lower envelope of the bounding
graphs. In this case, it suffices to decompose the graph of the lower envelope itself into at most
ζ(r) elementary cells. Indeed, having done that, we can extend each cell τ within the envelope into
the cell τ^- consisting of all points that lie vertically below τ. The new cells decompose K and are
also elementary.

Figure 1: A planar point set and a collection Γ of upper halfplanes. A random sample of the
lines bounding these ranges is shown in bold, with a decomposition of the region below their lower
envelope, which contains the region below the lower envelope of all the bounding lines, drawn
shaded.
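
For the planar case with lines, the decomposition just described can be sketched as follows. This is a minimal illustration, not the construction used in the paper: the function names are ours, lines are given as hypothetical (slope, intercept) pairs, and each envelope piece stands for the unbounded cell of all points vertically below it.

```python
from fractions import Fraction

def lower_envelope(lines):
    """Pieces of the lower envelope of the lines y = a*x + b, left to right.

    Each piece is (line, x_left, x_right); the points lying vertically
    below a piece form one (unbounded) elementary cell.
    """
    def meet(l1, l2):  # x-coordinate where two non-parallel lines cross
        return Fraction(l2[1] - l1[1]) / Fraction(l1[0] - l2[0])

    hull = []
    # Decreasing slope is the left-to-right order along the lower envelope.
    for line in sorted(set(lines), key=lambda l: (-l[0], l[1])):
        if hull and hull[-1][0] == line[0]:
            continue                       # parallel and higher: never minimal
        while len(hull) >= 2 and meet(hull[-2], line) <= meet(hull[-2], hull[-1]):
            hull.pop()                     # middle line is hidden by its neighbours
        hull.append(line)

    cuts = [meet(hull[i], hull[i + 1]) for i in range(len(hull) - 1)]
    return list(zip(hull, [float("-inf")] + cuts, cuts + [float("inf")]))

def cell_below(pieces, x, y):
    """Index of the cell containing (x, y), or None if the point is above the envelope."""
    for i, ((a, b), left, right) in enumerate(pieces):
        if left <= x <= right:
            return i if y <= a * x + b else None
    return None
```

For the lines y = x, y = -x and y = 5, only the first two appear on the envelope, and the region below it splits into two cells meeting at x = 0.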

As already discussed in the introduction, obtaining tight or nearly tight bounds for ζ(r) is still
a major open problem for many instances of the above setup. For example, decomposing an upper
envelope of r (d-1)-variate functions of constant description complexity into O*(r^{d-1}) elementary
cells is still open for any d ≥ 4. (This bound is best possible in the worst case, since it is the
worst-case tight bound on the complexity of such an undecomposed envelope [10].) The cases d = 2
(upper envelope of curves in the plane) and d = 3 (upper envelope of 2-dimensional surfaces in
3-space) are easy. In these cases ζ(r) is proportional to the complexity of the envelope, which in
the worst case is near-linear for d = 2 and near-quadratic for d = 3 [40]. In higher dimensions, the
only general-purpose bound known to date is the upper bound obtained by computing the vertical
decomposition of the entire arrangement of the given surfaces, and extracting from it the relevant
cells that lie on or above the envelope. In particular, for d = 4 the bound is ζ(r) = O*(r^4), as
follows from the results of [30]. This leaves a gap of about a factor of r between this bound and the
bound O*(r^3) on the complexity of the undecomposed envelope. Of course, in certain special cases,
most notably the case of hyperplanes, as studied in [33], both the envelope and its decomposition
have (considerably) smaller complexity.

The situation with the complexity of the union of geometric objects is even worse. While 
considerable progress was recently made on many special cases in two and three dimensions (see 
[9] for a recent comprehensive survey), there are only very few sharp bounds on the complexity of 
unions in higher dimensions. Worse still, even when a sharp bound on the complexity of the union 
is known, obtaining comparable bounds on the complexity of a decomposition of the complement 
of the union is a much harder problem (in d ≥ 3 dimensions). As an example, the union of n
congruent infinite cylinders in 3-space is known to have near-quadratic complexity [10], but it is
still an open problem whether its complement can be decomposed into a near-quadratic number of
elementary cells.

Partition theorem for shallow semi-algebraic ranges. We next apply the new shallow cutting
lemma to construct an elementary cell partition of a given input point set P, with respect to
a specific set Q of ranges. This is done in a fairly similar way to that in [6] (see also [33, 34]). A
major difference in handling the semi-algebraic case is the construction of a set Q of ranges that will 
be (a) small enough, and (b) representative of all shallow (or empty) ranges, in a sense discussed 
in detail below. The method given in [6] does not work in the general semi-algebraic case, and 
different, sometimes ad-hoc approaches need to be taken. 

The following theorem summarizes the main part of the construction (except for the construction 
of Q). 

Theorem 3.2 (Extended Partition Theorem) Let P be a set of n points in R^d, let Γ be a
family of semi-algebraic ranges of constant description complexity, and let r be fixed. Let Q be
another finite collection (not necessarily a subset of Γ) of semi-algebraic ranges of constant
description complexity with the following properties: (i) The ranges in Q are all (n/r)-shallow.
(ii) The complement of the union of any m ranges of Q can be decomposed into at most ζ(m)
elementary cells, for any m. (iii) Any (n/r)-shallow range γ ∈ Γ can be covered by the union of at
most δ ranges of Q, where δ is a constant.

Then there exists an elementary cell partition Π of P, of size O(r), into subsets of size roughly
n/r, such that the crossing number of any (n/r)-shallow range in Γ with the cells of Π is either
O(r/ζ^{-1}(r) + log r log |Q|), if ζ(r) = Ω(r^{1+ε}), for any fixed ε > 0, or
O(r log r/ζ^{-1}(r) + log r log |Q|), otherwise.

The proof, which, again, is similar to those in [6, 33, 34], proceeds through the following steps.
We first have:

Lemma 3.3 Let P be a set of n points in R^d, and r < n a parameter. Let Q be a set of
(n/r)-shallow ranges, with the property that the complement of the union of any subset of m ranges
of Q can be decomposed into at most ζ(m) elementary cells, for any m. Then there exists a subset
P' ⊆ P of at least n/2 points and an elementary cell partition Π = {(P_1, s_1), ..., (P_m, s_m)} for P'
with |P_i| = ⌊n/r⌋ for all i, such that each range of Q crosses at most O(r/ζ^{-1}(r) + log |Q|) cells
s_i of Π.

Proof. We will inductively construct disjoint sets P_1, ..., P_m ⊂ P of size n/r and elementary cells
s_1, ..., s_m such that P_i ⊂ s_i for each i. The construction terminates when |P_1 ∪ ··· ∪ P_m| ≥ n/2.
Suppose that P_1, ..., P_{i-1} have already been constructed, and set P_i' := P \ ∪_{j<i} P_j. We construct
P_i as follows: For a range σ ∈ Q, let κ_i(σ) denote the number of cells among s_1, ..., s_{i-1} crossed
by σ. We define a weighted collection (Q, w_i) of ranges, so that each range σ ∈ Q appears with
weight (or multiplicity) w_i(σ) = 2^{κ_i(σ)}. We put w_i(Q) = Σ_{σ∈Q} w_i(σ). By Lemma 3.1 and by our
assumption that the function ζ(r) is well behaved, there exists a (1/t)-cutting Ξ_i for the weighted
collection (Q, w_i) of size at most r/4, for an appropriate choice of t = Θ(ζ^{-1}(r)), with the following
properties: The union of Ξ_i contains the complement of the union of Q, and the complement of the
union of Ξ_i is contained in the union of O(t) ranges of Q. Since all these ranges are (n/r)-shallow,
the number of points of P not in the union of Ξ_i is at most O(t) · (n/r) = n · O(ζ^{-1}(r)/r), and
our assumptions on ζ(r) imply that this is smaller than n/4, if we choose t appropriately. Since we
assume that |P_i'| > n/2, it follows that at least n/4 points of P_i' lie in the union of the at most r/4
cells of Ξ_i. By the pigeonhole principle, there is a cell s_i of Ξ_i containing at least n/r points of P_i'.
We take P_i to be some subset of P_i' ∩ s_i of size exactly n/r, and make s_i the cell in the partition
which contains P_i.

We next establish the asserted bound on the crossing numbers between the ranges of Q and the
elementary cells s_1, ..., s_m, in the following standard manner. The final weight w_m(σ) of a range
σ ∈ Q with crossing number κ (with respect to the final partition) is 2^κ. On the other hand, each
newly added cell s_i is crossed by ranges of Q of total weight O(w_i(Q)/ζ^{-1}(r)), because s_i is an
elementary cell of the corresponding weighted (1/t)-cutting Ξ_i. The weight of each of these crossing
ranges is doubled at the i-th step, and the weight of all the other ranges remains unchanged. Thus
w_{i+1}(Q) ≤ w_i(Q)(1 + O(1/ζ^{-1}(r))). Hence, for each range σ ∈ Q we have

\[ w_m(\sigma) \le w_m(Q) \le |Q| \left(1 + O\!\left(\frac{1}{\zeta^{-1}(r)}\right)\right)^{m} \le |Q| \left(1 + O\!\left(\frac{1}{\zeta^{-1}(r)}\right)\right)^{r} \le |Q|\, e^{O(r/\zeta^{-1}(r))}, \]

and thus κ = log w_m(σ) = O(r/ζ^{-1}(r) + log |Q|). □
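
The weight-doubling argument in this proof can be checked numerically with a small simulation. This is only a sanity check under stated assumptions: the greedy "adversary" that picks which ranges to double, the function name, and the choice of 1 for the constant in the O(·) term are ours; the parameter t stands for ζ^{-1}(r).

```python
import math

def simulate_reweighting(num_ranges, rounds, t):
    """Simulate `rounds` insertions of cells under the reweighting scheme.

    Every range starts with weight 1 (crossing count 0).  In each round the
    new cell may be crossed by ranges of total weight at most W/t, where W is
    the current total weight; the weight of each crossing range is doubled.
    Returns the final crossing counts kappa.
    """
    kappa = [0] * num_ranges
    for _ in range(rounds):
        total = sum(2 ** k for k in kappa)
        budget = total / t                       # weight allowed to cross the new cell
        spent = 0
        # Adversary: try to double the currently heaviest ranges first.
        for idx in sorted(range(num_ranges), key=lambda i: -kappa[i]):
            w = 2 ** kappa[idx]
            if spent + w <= budget:
                spent += w
                kappa[idx] += 1
    return kappa
```

Whatever the adversary does, the total weight grows by a factor of at most 1 + 1/t per round, so the largest crossing count never exceeds log_2 |Q| + rounds · log_2(1 + 1/t), which is the discrete form of the bound derived above.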

Discussion. The limitation of Lemma 3.3 is that the bound that it derives (a) applies only
to ranges in Q, and (b) includes the term log |Q|. An ingenious component of the analysis in [33]
overcomes both problems, by choosing a test set Q of ranges whose size is only polynomial in r (and,
in particular, is independent of n), which is nevertheless sufficiently representative of all shallow
ranges, in the sense that the crossing number of any (n/r)-shallow range is O(max{κ(σ) | σ ∈ Q}).
This implies that Lemma 3.3 holds for all shallow ranges, with the stronger bound which does not
involve log |Q|.

Unfortunately, the technique of [33] does not extend to the case of semi-algebraic ranges, as it
crucially relies on the linearity of the ranges.⁶ The following lemma gives a sufficient condition for
a test set Q to be representative of the relevant shallow ranges, in the sense that Q satisfies the
assumptions made in Theorem 3.2. That is:

Lemma 3.4 Let P be a set of n points in R^d, and let Γ be a family of semi-algebraic ranges with
constant description complexity. Consider an elementary-cell partition Π = {(P_1, s_1), ..., (P_r, s_r)}
of P such that |P_i| = n/r for each i. Let Q be a finite set of (n/r)-shallow ranges (not necessarily
ranges of Γ), so that the maximal crossing number of a range q ∈ Q with respect to Π is κ. Then,
for any range γ ∈ Γ which is contained in the union of at most δ ranges of Q (for some constant
δ), the crossing number of γ is at most (κ + 1)δ.

Proof. Let γ ∈ Γ be a range for which there exist δ ranges q_1, ..., q_δ of Q such that γ ⊆ q_1 ∪ ··· ∪ q_δ.
Then, if γ crosses a cell s_i of Π, then at least one of the covering ranges q_j must either cross s_i or
fully contain s_i. The number of cells of Π that can be crossed by any single q_j is at most κ, and
each q_j can fully contain at most one cell of Π (because q_j is (n/r)-shallow).⁷ Hence, the overall
number of cells of Π that γ can cross is at most (κ + 1)δ, as asserted. □

⁶ It uses point-hyperplane duality, and exploits the fact that a halfspace (bounded by a hyperplane) intersects a
simplex if and only if it contains a vertex of the simplex, which is false in the general semi-algebraic case.

⁷ By choosing a slightly smaller value for r in the construction of the partition, we can even rule out the possibility
that a range q_j fully contains a cell of Π. This however has no effect on the asymptotic bounds that the analysis
derives.
Proof of Theorem 3.2. Apply Lemma 3.3 to the input set P_0 = P, with parameter r_0 = r.
This yields an elementary-cell partition Π_0 for (at least) half of the points of P_0, which satisfies the
properties of that lemma. Let P_1 denote the set of the remaining points of P_0, and set r_1 = r_0/2.
Apply Lemma 3.3 again to P_1 with parameter r_1, obtaining an elementary cell partition Π_1 for (at
least) half of the points of P_1. We iterate this process k = O(log r) times, until the set P_k has fewer
than n/r points. We take Π to be the union of all the elementary-cell partitions Π_i formed so far,
together with one large cell containing all the remaining points of P_k. The resulting elementary-cell
partition of P consists of at most 1 + r + r/2 + r/4 + ... ≤ 2r subsets, each of size at most n/r.
The crossing number of a range in Q is, by Lemma 3.3,

\[ \sum_{i=0}^{k} O\!\left(\frac{r_i}{\zeta^{-1}(r_i)} + \log|Q|\right) = O\!\left(\sum_{i=0}^{k} \frac{r/2^i}{\zeta^{-1}(r/2^i)} + \log r \log|Q|\right). \]

Our assumptions on ζ imply that if ζ(r) = Ω(r^{1+ε}), for any fixed ε > 0, the first terms add up to
O(r/ζ^{-1}(r)); otherwise we can bound their sum by O(r log r/ζ^{-1}(r)). Hence, by the properties of
Q and by Lemma 3.4, the crossing number of any empty range is also O(r/ζ^{-1}(r) + log |Q| log r)
or O(r log r/ζ^{-1}(r) + log |Q| log r), respectively. □
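
The two cases at the end of the proof can be illustrated numerically. The sketch below is an assumption-laden toy: it takes ζ(m) = m^k exactly (so ζ^{-1}(m) = m^{1/k}), drops all constants and the log |Q| terms, and uses a function name of our own choosing.

```python
def crossing_sum(r, k):
    """Sum of r_i / zeta^{-1}(r_i) over r_i = r, r/2, r/4, ..., down to 1,
    for the model function zeta(m) = m**k."""
    total, ri = 0.0, float(r)
    while ri >= 1:
        total += ri / ri ** (1.0 / k)   # equals r_i ** (1 - 1/k)
        ri /= 2
    return total
```

For k = 2 the terms decrease geometrically, so the sum is within a constant factor of its first term r^{1/2}; for k = 1 every term equals 1 and the sum is the number of rounds, about log r, matching the extra logarithmic factor in the second case.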

Partition trees and reporting or emptiness searching. As in the classical works on range
searching [6, 33, 34], we apply Theorem 3.2 recursively, and obtain a partition tree T, where each
node v of T stores a subset P_v of P and an elementary cell σ_v enclosing P_v. The children of a node
v are obtained from an elementary cell partition of P_v; each of them stores one of the resulting
subsets of P_v and its enclosing cell. At the leaves, the size of the subset that is stored is O(r).

Testing a range γ for emptiness is done by searching with γ in T. At each visited node v, where
γ ∩ σ_v ≠ ∅, we test whether γ ⊇ σ_v, in which case γ is not empty. Otherwise, we find the children
of v whose cells are intersected by γ. If there are too many of them we know that γ is not empty.
Otherwise, we recurse at each child.

Reporting is performed in a similar manner. If σ_v ⊆ γ we output all of P_v. Otherwise, we find
the children of v whose cells are intersected by γ. If there are too many of them we know that γ is
not (n_v/r)-shallow (with respect to P_v), so, if r is a constant, we can afford to check every element
of P_v for containment in γ, and output those points that do lie in γ. If there are not too many
children, we recurse in each of them.
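
The two search procedures can be sketched on a toy partition tree. Everything below is an illustrative assumption rather than the structure of Theorem 3.2: points are planar, cells are bounding boxes produced by a simple coordinate split (not an elementary cell partition with a crossing-number guarantee), and the query ranges are disks. For that reason the "too many children" shortcut is controlled by a hypothetical parameter `max_crossed` that is off by default; it is sound only when the tree really satisfies the crossing bound.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]          # xmin, ymin, xmax, ymax

@dataclass
class Disk:                                       # the query range gamma
    cx: float
    cy: float
    rad: float

    def contains_point(self, p: Point) -> bool:
        return (p[0] - self.cx) ** 2 + (p[1] - self.cy) ** 2 <= self.rad ** 2

    def intersects(self, b: Box) -> bool:
        nx = min(max(self.cx, b[0]), b[2])        # point of the box nearest to the center
        ny = min(max(self.cy, b[1]), b[3])
        return self.contains_point((nx, ny))

    def contains(self, b: Box) -> bool:           # a disk is convex: check the corners
        return all(self.contains_point((x, y)) for x in (b[0], b[2]) for y in (b[1], b[3]))

@dataclass
class Node:
    points: List[Point]                           # the subset stored at the node
    cell: Box                                     # its enclosing cell
    children: List["Node"] = field(default_factory=list)

def build(points: List[Point], r: int = 4, depth: int = 0) -> Node:
    """Toy tree: split a nonempty point list into at most r children (r >= 2)."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    node = Node(points, (min(xs), min(ys), max(xs), max(ys)))
    if len(points) > r:
        pts = sorted(points, key=lambda p: p[depth % 2])
        size = -(-len(pts) // r)                  # ceil(len / r) points per child
        node.children = [build(pts[i:i + size], r, depth + 1)
                         for i in range(0, len(pts), size)]
    return node

def is_empty(v: Node, gamma: Disk, max_crossed: Optional[int] = None) -> bool:
    if not gamma.intersects(v.cell):
        return True
    if gamma.contains(v.cell):
        return False                              # the cell holds at least one point
    if not v.children:
        return not any(gamma.contains_point(p) for p in v.points)
    hit = [c for c in v.children if gamma.intersects(c.cell)]
    if max_crossed is not None and len(hit) > max_crossed:
        return False                              # sound only under the crossing bound
    return all(is_empty(c, gamma, max_crossed) for c in hit)

def report(v: Node, gamma: Disk, max_crossed: Optional[int] = None) -> List[Point]:
    if not gamma.intersects(v.cell):
        return []
    if gamma.contains(v.cell):
        return list(v.points)
    hit = [c for c in v.children if gamma.intersects(c.cell)]
    if not v.children or (max_crossed is not None and len(hit) > max_crossed):
        return [p for p in v.points if gamma.contains_point(p)]   # brute force at v
    return [p for c in hit for p in report(c, gamma, max_crossed)]
```

On a 10 × 10 integer grid, a small disk centered between grid points is reported empty, and a unit disk at the origin reports the three grid points it contains.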

The efficiency of the search depends on the function ζ(m). If ζ(m) = O*(m^k) then an emptiness
query takes O*(n^{1-1/k}) time, and a reporting query takes O*(n^{1-1/k}) + O(t), where t is the output
size. Thus making ζ (i.e., k) small is the main challenge in this technique.
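
The query bound can be traced back to a standard recurrence. The following is a sketch under simplifying assumptions: r is a constant, k > 1, ζ(m) = m^k up to lower-order factors, and c is a hypothetical constant that absorbs the constant of the crossing-number bound together with its additive logarithmic terms.

```latex
% Q(n): cost of an emptiness query on a subtree storing n points.
% A shallow range crosses at most c r^{1-1/k} of the O(r) child cells,
% each child stores at most n/r points, and inspecting the children costs O(r).
\[
  Q(n) \;\le\; c\, r^{1-1/k}\, Q(n/r) + O(r), \qquad Q(O(r)) = O(r).
\]
% Unrolling over the \log_r n levels of the tree:
\[
  Q(n) \;=\; O\!\left( r \cdot \bigl(c\, r^{1-1/k}\bigr)^{\log_r n} \right)
        \;=\; O\!\left( n^{\,1-1/k+\log_r c} \right).
\]
% The extra exponent \log_r c can be made smaller than any fixed \varepsilon > 0
% by choosing the constant r large enough, which gives O^*(n^{1-1/k}).
```

For reporting, the same recurrence applies to the nodes that are not charged to the output, which accounts for the additive O(t) term.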

A general recipe for constructing good test sets. Let Γ be the given collection of
semi-algebraic ranges of constant description complexity. As above, we assume that each range γ ∈ Γ
has t degrees of freedom, for some constant parameter t, so it can be represented as a point γ* in a
t-dimensional parametric space, which, for convenience, we denote as R^t. Each input point p ∈ P
is mapped to a region K_p, which is the locus of all points representing ranges which contain p.

We fix a parameter r > 1, and choose a random sample N of a r log r points of P, where a
is a sufficiently large constant. We form the set N* = {K_p | p ∈ N}, construct the arrangement
A(N*), and let V = A_{≤k}(N*) denote the region consisting of all points contained in at most k
regions of N*, where k = b log r and b is an absolute constant that we will fix later. We decompose
V into elementary cells, using, e.g., vertical decomposition [30]. In the worst case, we get O*(r^{2t-4})
elementary cells [17, 30].⁸
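
The dual picture can be made concrete for one simple family of ranges. The sketch below assumes, purely for illustration, that the ranges are disks in the plane, so that t = 3 and a disk is represented by the point (cx, cy, rad); the region K_p is then the set of triples whose disk contains p, and the level of a dual point in the arrangement of the sampled regions is simply the number of sample points inside the disk. The function names and the default constants a and b are hypothetical.

```python
import math

def sample_size(r, a=4.0):
    """Size a * r * log(r) of the random sample N (a is a hypothetical constant)."""
    return math.ceil(a * r * math.log(r))

def depth(disk, sample):
    """Number of sampled regions K_p containing the dual point of `disk`,
    which equals the number of sample points lying inside the disk."""
    cx, cy, rad = disk
    return sum((x - cx) ** 2 + (y - cy) ** 2 <= rad ** 2 for x, y in sample)

def in_shallow_region(disk, sample, r, b=1.0):
    """True if the dual point of `disk` lies in V, i.e. at level at most b * log(r)."""
    return depth(disk, sample) <= b * math.log(r)
```

A disk that contains few sample points has its dual point inside V and is therefore covered by one of the cells of the decomposition; a disk containing many sample points falls outside V.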

Let τ be one of these cells. We associate with τ a generalized range γ_τ in R^d, which is the union
∪{γ | γ* ∈ τ}. Since τ has constant description complexity, as do the ranges of Γ, it is easy to
show that γ_τ is also a semi-algebraic set of constant description complexity (see [16]).

We define the test set Q to consist of all the generalized ranges γ_τ, over all cells τ in the
decomposition of V, and claim that, with high probability (and with an appropriate choice of b),
Q is a good test set, in the following three aspects.

(i) Compactness. |Q| = O*(r^{2t-4}); that is, the size of Q is polynomial in r and independent of n.

(ii) Shallowness. Each range γ_τ in Q is β(n/r)-shallow with respect to P, for some constant
parameter β.

(iii) Containment. Every (n/r)-shallow range γ ∈ Γ is contained in a single range γ_τ of Q.

Property (i) is obvious. Consider the range space (P, Γ*), where Γ* consists of all generalized
ranges γ_τ, over all elementary cells τ of the form arising in the above vertical decomposition. It
is a fairly easy exercise to show that (P, Γ*) also has finite VC-dimension. See, e.g., [10]. By
Theorem 2.2, if a is a sufficiently large constant (proportional to the VC-dimension of (P, Γ*)) then
N is a shallow (1/r)-net for both range spaces (P, Γ) and (P, Γ*), with high probability, so we
assume that N is indeed such a shallow (1/r)-net.

Let γ_τ ∈ Q. Note that any point p ∈ P in γ_τ lies in a range γ ∈ Γ with γ* ∈ τ. By definition,
γ* also belongs to K_p, and so K_p crosses or fully contains τ. Since τ is (b log r)-shallow in A(N*),
it is fully contained in at most b log r regions K_p, for p ∈ N (and is not crossed by any such region).
Hence |γ_τ ∩ N| ≤ b log r, so, since N is a shallow (1/r)-net for (P, Γ*), we have |γ_τ ∩ P| ≤ c(b+1)n/r,
so γ_τ is (c(b+1)n/r)-shallow, which establishes (ii).

For (iii), let γ ∈ Γ be an (n/r)-shallow range. Since N is a shallow (1/r)-net for (P, Γ), and
|γ ∩ P| ≤ |P|/r, we have |γ ∩ N| ≤ 2c log r. Hence, with b ≥ 2c, γ* ∈ V, so there is a cell τ of
the decomposition which contains γ*, which, by construction, implies that γ ⊆ γ_τ, thus establishing
(iii).

To make Q a really good test set, we also need the following fourth property: 

(iv) Efficiency. There exists a good bound on the associated function ζ(m), bounding the size of
a decomposition of the complement of the union of any m ranges of Q.

The potentially rather complex shape of these generalized ranges makes it harder to obtain, in
general, a good bound on ζ.

In what follows, we manage to use this general recipe in two of our four applications (ray shooting
amid balls and range fullness searching), with good bounds on the corresponding functions ζ(·). In
two other planar applications (range emptiness searching with fat triangles and with circular caps),
we abandon the general technique, and construct ad hoc good test sets.

Remark: In the preceding construction, we wanted to make sure that every (n/r)-shallow range
γ ∈ Γ is covered by a range of Q. If we only need this property for empty ranges γ (which is the
case for emptiness testing), it suffices to consider only the 0-level of A(N*), i.e., the complement
of the union of N*. Other than this simplification, the construction proceeds as above.

⁸ Here, in this dual construction, we do not need any sharper bound; any bound polynomial in r is sufficient for
our purpose.
4 Fullness searching and reporting outliers for convex ranges



Let P be a set of n points in 3-space, and let Γ be a set of convex ranges of constant description
complexity. We wish to preprocess P in near-linear time into a data structure of linear size, so
that, given a query range γ ∈ Γ, we can efficiently determine whether γ contains all the points
of P. Alternatively, we want to report all the points of P that lie outside γ. (This is clearly a
special case of range emptiness searching or range reporting, if one considers the complements of
the ranges in Γ.) For simplicity, we only focus on the range fullness problem; the extension to
reporting "outliers" is similar to the standard treatment of reporting queries, as discussed earlier.

We present a solution to this problem, with O*(n^{1/2}) query time, thereby improving over the
best known general bound of O*(n^{2/3}), given in [6], which applies to any range searching (e.g.,
range counting) with semi-algebraic sets (of constant description complexity) in R^3.

To apply our technique to this problem we first need to build a good test set. Since fullness
searching is complementary to emptiness searching, we need a property complementary to that
assumed in Theorem 3.2 (see also Lemma 3.4). In fact, we will enforce the property that every full
range γ fully contains a single range of Q, which is "almost full" (contains at least n - n/r points
of P).

As above, assuming the ranges of Γ to have t degrees of freedom, we map each range γ ∈ Γ
to a point γ* in R^t. A point p ∈ R^3 is mapped to a region K_p which is the locus of all the
points γ* that correspond to ranges γ which contain p. We fix r < n, take a random sample
N of O(r log r) points of P (with a sufficiently large constant of proportionality), construct the
intersection I = ∩_{p∈N} K_p, and decompose it into elementary cells. For each resulting cell σ, let
γ_σ denote the intersection ∩_{γ*∈σ} γ. As above, since σ has constant description complexity, γ_σ is a
semi-algebraic set of constant description complexity. Note that, since the ranges in Γ are convex,
each range γ_σ is also convex (albeit of potentially more complex shape than that of the original
ranges).

Define the test set Q to consist of all the generalized ranges γ_σ, over all cells σ in the
decomposition of I. We argue that Q satisfies all four properties required from a good test set: (i)
Compactness: As above, the size of Q is polynomial in r (it is at most O*(r^{2t-4})). (ii) Shallowness
(or, rather, "almost fullness"): For each cell σ and any γ ∈ Γ with γ* ∈ σ, γ* lies in all the sets K_p,
for p ∈ N, and thus N ⊆ γ. By construction, we also have N ⊆ γ_σ. Apply the ε-net theory [27] to
the range space (P, Γ^c), where the ranges of Γ^c are complements of ranges of the same form as the
ranges γ_σ. Since γ_σ^c ∩ N = ∅ for each cell σ in the decomposition, we have, with high probability,
the property that for each cell σ of I, γ_σ contains at least n - n/r points of P, so it is an "almost
full" range. (iii) Containment: Let γ ∈ Γ be a full range. Then, in particular, N ⊆ γ. Then γ* ∈ I,
and let σ be the cell of I containing γ*. Then, by construction, γ_σ ⊆ γ. (iv) Efficiency: Finally, we
show that the complexity of a decomposition of the intersection of any m ranges in Q is O*(m^2),
so ζ(m) = O*(m^2).

Claim 4.1 Let Q be a set of convex "almost full" ranges, each containing at least n - n/r points of
P. The intersection, K, of any m ranges q_1, ..., q_m ∈ Q can be decomposed into O*(m^2) elementary
cells.

Proof. Since all ranges in Q are convex, K is a convex set too. Assume, for simplicity of presentation,
that K is nonempty and has nonempty interior, and fix a point o in that interior. We can regard the
boundary of each q_i as the graph of a bivariate function ρ = F_i(θ, φ) in spherical coordinates about
o. Then ∂K is the graph of the lower envelope of these functions. Since the q_i's have constant
description complexity, (the graph of) each F_i is also a semi-algebraic set of constant description
complexity.⁹ Hence the combinatorial complexity of ∂K is O*(m^2) [10]. Moreover, since ∂K is
2-dimensional, we can partition it into O*(m^2) trapezoidal-like elementary cells, using a variant of
the vertical decomposition technique, and then extend each such cell τ_0 to a 3-dimensional cone-like
cell τ, which is the union of all segments connecting o to the points of τ_0. The resulting cells τ
constitute a decomposition of K into O*(m^2) elementary cells, as claimed. □
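
The radial view of the boundary used in this proof can be illustrated with balls, for which the function F_i has a closed form. This is a sketch under simplifying assumptions: the convex ranges are balls, the reference point o is the origin and lies in the interior of every ball, and the function names are ours.

```python
import math

def radial_distance(ball, u):
    """Distance from the origin o to the boundary of `ball` along the unit vector u.

    ball = (center, radius); the origin must lie in the interior of the ball.
    Solves |s*u - center|^2 = radius^2 for the positive root s.
    """
    c, rad = ball
    uc = sum(ui * ci for ui, ci in zip(u, c))
    cc = sum(ci * ci for ci in c)
    return uc + math.sqrt(uc * uc - cc + rad * rad)

def boundary_of_intersection(balls, theta, phi):
    """Lower envelope of the radial functions at direction (theta, phi):
    returns (rho, index of the ball attaining the minimum)."""
    u = (math.sin(theta) * math.cos(phi),
         math.sin(theta) * math.sin(phi),
         math.cos(theta))
    dists = [radial_distance(b, u) for b in balls]
    rho = min(dists)
    return rho, dists.index(rho)
```

For two unit balls centered at (0, 0, 0) and (0.5, 0, 0), the boundary of the intersection in the +x direction is supplied by the first ball and in the -x direction by the second.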

Using the machinery developed in the preceding section, we therefore obtain the following result. 



Theorem 4.2 Let P be a set of n points in R^3, and let Γ be a family of convex ranges of constant
description complexity. Then one can construct, in near linear time, a data structure of linear size
so that, for any range γ ∈ Γ, it can determine, in O*(n^{1/2}) time, whether γ is full.

Reporting outliers. To extend the above approach to the problem of reporting outliers, we
apply a construction similar to that in the "general recipe" presented above. That is, we take the
b log r deepest levels of A(N*), for an appropriate constant b, decompose them into elementary cells,
and construct a generalized range γ_σ for each of these cells σ. The general machinery given above
implies the following result:

Theorem 4.3 Let P be a set of n points in R^3, and let Γ be a family of convex ranges of constant
description complexity. Then one can construct, in near linear time, a data structure of linear size
so that, for any range γ ∈ Γ, it can report the points of P in the complement of γ, in O*(n^{1/2}) + O(k)
time, where k is the query output size.

4.1 Farthest point from a convex shape 

A useful application of the data structure of Theorem 4.2 is to farthest point queries. In such a
problem we are given a set P of n points in R^3, and wish to preprocess it, in near-linear time, into
a data structure of linear size, so that, given a convex query object o (from some fixed class of
objects with constant description complexity), we can efficiently find the point of P farthest from
o.

We solve this problem using parametric searching [36]. The corresponding decision problem is:
Given the query object o and a distance ρ, determine whether the Minkowski sum o ⊕ B_ρ is full,
where B_ρ is the ball of radius ρ centered at the origin. The smallest ρ with this property is the
distance to the farthest point from o. With an appropriate small-depth parallel implementation of
this decision problem, the parametric searching also takes time O*(n^{1/2}). Reporting the k farthest
points from o, for any parameter k, can be done in O*(n^{1/2}) + O(k) time, using a simple variant of
this technique.
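
The reduction from the farthest-point query to a fullness decision can be sketched for a query line. Two substitutions are made here and should be read as assumptions: the decision step is answered by brute force rather than by the data structure of the fullness theorem, and plain bisection on the radius replaces parametric searching, so the result is only approximate (within `tol`). All names are hypothetical.

```python
import math

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _norm(u):
    return math.sqrt(sum(a * a for a in u))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dist_to_line(p, a, b):
    """Distance from p to the line through the distinct points a and b."""
    axis = _sub(b, a)
    return _norm(_cross(axis, _sub(p, a))) / _norm(axis)

def is_full(points, a, b, rho):
    """Decision problem: does the cylinder of radius rho around the line contain P?"""
    return all(dist_to_line(p, a, b) <= rho for p in points)

def farthest_distance(points, a, b, tol=1e-9):
    """Smallest rho (up to tol) for which the cylinder around the line is full."""
    lo, hi = 0.0, max(_norm(_sub(p, a)) for p in points)   # hi is always full
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if is_full(points, a, b, mid):
            hi = mid
        else:
            lo = mid
    return hi
```

For the x-axis and the points (0, 0, 0), (0, 3, 4), (1, 1, 0), the search converges to 5, the distance of (0, 3, 4) from the axis.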



4.2 Computing the largest-area, largest-perimeter, and largest-height triangles 

Let P be a set of n points in R^3. We wish to find the triangle whose vertices belong to P and
whose area (respectively, perimeter, height) is maximal. This problem is a useful subroutine in
path approximation algorithms; see [21]. Daescu and Serfling [21] gave an O*(n^{13/5})-algorithm for
the 3-dimensional largest-area triangle. In d dimensions, the running time is O*(n^{3-2/(⌊d/2⌋^2+1)}).

⁹ With an appropriate algebraic re-parametrization of the spherical coordinates, of course.

In R^3, our technique, without any additional enhancements, yields the improved bound O*(n^{5/2}),
using the following straightforward procedure. For each pair of points p_1, p_2 ∈ P, we find the
farthest point q ∈ P from the line p_1p_2, compute the area of Δp_1p_2q, and output the largest-area
triangle among those triangles. The procedure performs farthest-point queries from O(n^2) lines,
for a total cost of O*(n^{5/2}), as claimed.
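
The straightforward procedure can be written down directly. In this sketch the farthest-point query is answered by a linear scan instead of the sublinear data structure, so the code runs in cubic time and only illustrates the logic; the function names are ours and the input points are assumed to be distinct.

```python
import math
from itertools import combinations

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    return math.sqrt(sum(a * a for a in u))

def largest_area_triangle(points):
    """Return (area, (p1, p2, q)) for the largest-area triangle spanned by points."""
    best_area, best_triangle = 0.0, None
    for p1, p2 in combinations(points, 2):
        axis = _sub(p2, p1)
        # Farthest point from the line p1p2: |axis x (p - p1)| is the distance
        # to the line scaled by the fixed factor |axis|.
        q = max(points, key=lambda p: _norm(_cross(axis, _sub(p, p1))))
        area = 0.5 * _norm(_cross(axis, _sub(q, p1)))
        if area > best_area:
            best_area, best_triangle = area, (p1, p2, q)
    return best_area, best_triangle
```

For the points (0,0,0), (4,0,0), (0,3,0), (1,1,0), (0,0,1), the largest triangle is spanned by (4,0,0), (0,3,0), (0,0,1) and has area 6.5.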

We can improve this solution, using the following standard decomposition technique, to an
algorithm with running time O*(n^{26/11}). First, the approach just described performs M farthest-point
queries on a set of N points in time O*(M N^{1/2} + N), where the second term is the preprocessing
cost of preparing the data structure.

Before continuing, we note the following technical issue. Recall that we find the farthest point
from a query line ℓ by drawing a cylinder C_ρ around ℓ, whose radius ρ is the smallest (unknown)
radius for which C_ρ contains P. The concrete value of ρ is found using parametric searching. In
the approach that we follow now, we will execute in parallel O(n^2) different queries, each with its
own ρ, so care has to be taken when running the parametric search with this multitude of different
unknown values of ρ.

While there are several alternative solutions to this problem, we use the following one, which
seems the cleanest. Let A > 0 be a fixed parameter. For each pair p_1, p_2 of distinct points of P, let
C_A(p_1, p_2) denote the cylinder whose axis passes through p_1 and p_2 and whose radius is 2A/|p_1p_2|.
In the decision procedure, we specify the value of A, and perform O(n^2) range fullness queries
with the cylinders C_A(p_1, p_2). If all of them are found to be full, then A ≥ A*, where A* is the
(unknown) maximal area of a triangle spanned by P; otherwise A < A*. (With a somewhat finer
implementation, we can also distinguish between the cases A > A* and A = A*; we omit the details
of this refinement.)
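
The decision procedure with a single shared parameter A can be sketched as follows. Fullness queries are answered here by brute force, which is an assumption made for illustration only, and the function name is hypothetical. A point q lies outside the cylinder of radius 2A/|p_1p_2| around the line p_1p_2 exactly when the triangle p_1p_2q has area larger than A.

```python
import math
from itertools import combinations

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _norm(u):
    return math.sqrt(sum(a * a for a in u))

def all_cylinders_full(points, area):
    """True iff every cylinder with axis p1p2 and radius 2*area/|p1p2|
    contains all the (distinct) input points, i.e. iff area >= A*."""
    for p1, p2 in combinations(points, 2):
        axis = _sub(p2, p1)
        radius = 2 * area / _norm(axis)
        for q in points:
            dist = _norm(_cross(axis, _sub(q, p1))) / _norm(axis)
            if dist > radius:
                return False          # triangle p1 p2 q has area larger than `area`
    return True
```

For a point set whose largest triangle has area 6.5, the procedure answers "full" for A = 6.6 and "not full" for A = 6.4.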

To implement the decision procedure, we apply a duality transform, where each cylinder C in
3-space is mapped to a point C* = (a, b, c, d, ρ), where (a, b, c, d) is some parametrization of the
axis of C and ρ is its radius. In this dual parametric 5-space, a point p ∈ R^3 is mapped to a surface
p*, which is the locus of all (points representing) cylinders which contain p on their boundary.
Note that the portion of space below (resp., above) p*, in the ρ-direction, consists of points dual
to cylinders which do not contain (resp., contain) p.

Let P* = {p* | p ∈ P}. Fix some sufficiently large but constant parameter r_0, and construct
a (1/r_0)-cutting Ξ of A(P*), using the vertical decomposition of a random sample of O(r_0 log r_0)
surfaces of P* (see, e.g., |lQ]). As follows from [17, 30], the combinatorial complexity of Ξ is O*(r_0^6).
We distribute the O(n^2) points dual to the query cylinders among the cells of Ξ, in brute force, and
also find, in equally brute force, for each cell τ the subset P*_τ of surfaces which cross τ. We ignore
cells which fully lie below some surface of P*, because cylinders whose dual points fall in such a
cell cannot be full (the decision algorithm stops as soon as such a point (cylinder) is detected). For
each of the remaining cells τ, we repeat this procedure with the subset of the points dual to the
surfaces in P*_τ and with the subset of cylinders whose dual points lie in τ. We keep iterating in
this manner until we reach cuttings whose cells are crossed by at most n/r dual surfaces, where r
is some (non-constant) parameter that we will shortly fix. As is easily checked, the overall number
of cells in these cuttings is O*(r^6).

We then run the preceding weaker procedure on each of the resulting cells τ, with the set P_τ
of points dual to the surfaces which cross τ and with the set C_τ of cylinders whose dual points lie
in τ. Letting m_τ denote the number of these cylinders, the overall cost of the second phase of the
procedure is

\[ \sum_{\tau} O^*\!\left(m_\tau (n/r)^{1/2} + n/r\right) = O^*\!\left(n^2 (n/r)^{1/2} + n r^5\right). \]

Since r_0 is a constant, the overall cost of the first phase is easily seen to be proportional to the
overall size of the resulting subproblems, which is O*(n^2 + n r^5). Overall, the cost is thus

\[ O^*\!\left(\frac{n^{5/2}}{r^{1/2}} + n r^5\right). \]

Choosing r = n^{3/11}, this becomes O*(n^{26/11}).
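
The choice of r can be verified with exact rational arithmetic. The sketch below models the running time as n^2 (n/r)^q + n r^{c-1}, where the cuttings have O*(r^c) cells and one fullness query on N points costs O*(N^q); the function name and this two-term model are our simplification, with all lower-order factors ignored.

```python
from fractions import Fraction

def balanced_exponents(cutting_exp, query_exp=Fraction(1, 2)):
    """Balance n^2 (n/r)^q against n r^(c-1), writing r = n^x.

    cutting_exp: c, where the cuttings have O*(r^c) cells
    query_exp:   q, where a fullness query on N points costs O*(N^q)
    Returns (x, total), meaning r = n^x and running time O*(n^total).
    """
    c, q = Fraction(cutting_exp), Fraction(query_exp)
    # Equate exponents: 2 + q (1 - x) = 1 + (c - 1) x  =>  x = (1 + q) / (c - 1 + q)
    x = (1 + q) / (c - 1 + q)
    return x, 1 + (c - 1) * x
```

With c = 6, the value for cuttings in the dual 5-space of cylinders, this gives r = n^{3/11} and total exponent 26/11. Under the same model, c = 8 and c = 4 (the cuttings in six and four dimensions that arise for the perimeter and height variants) give exponents 12/5 and 16/7.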

Running a generic version of this decision procedure in parallel is fairly straightforward. The 
cuttings themselves depend only on the dual surfaces, which do not depend on A*, so we can 
construct them in a concrete, non-parametric fashion. Locating the points dual to the query 
cylinders can be done in parallel, and, since r_0 is a constant, this takes constant parallel depth
for each of the logarithmically many levels of cuttings. The second phase can also be executed
in parallel in an obvious manner. Omitting the further easy details, we conclude that the overall
algorithm also takes O*(n^{26/11}) time.

Largest-perimeter triangle. The above technique can be adapted to yield efficient solutions of
several problems of a similar flavor. For example, consider the problem of computing the
largest-perimeter triangle among those spanned by a set P of n points in R^3. Here, for each pair p_1, p_2
of points of P, and for a specified perimeter π, we construct the ellipsoid of revolution E_π(p_1, p_2),
whose boundary is the locus of all points q satisfying |qp_1| + |qp_2| = π - |p_1p_2|. (Here, of course, we
only consider pairs p_1, p_2 with |p_1p_2| < π/2.) We now run O(n^2) range fullness queries with these
ellipsoids, and report that π* > π if at least one of these ellipsoids is not full, or π* ≤ π otherwise,
where π* is the largest perimeter.

The efficient implementation of this procedure is carried out similarly to the preceding algorithm,
except that here the dual representation of our ellipsoids requires six degrees of freedom, to specify
the foci p_1 and p_2. Unlike the previous case, the dual surfaces p* do depend on π, so, in the generic
implementation of the decision procedure we also need to construct the various (1/r_0)-cuttings in
a generic, parallel manner.¹⁰ However, since r_0 is a constant, this is easy to do in constant parallel
depth per cutting. A (1/r_0)-cutting in R^6 has complexity O*(r_0^8) [17, 30]. A modified version of
the preceding analysis then yields:

Theorem 4.4 The largest-perimeter triangle among those spanned by a set of n points in R^3 can
be computed in O*(n^{12/5}) time.

Largest-height triangle. In this variant, we wish to compute the triangle with largest height
among those determined by a set P of n points in R^3. Here, for each pair p_1, p_2 of points of P, and
for a specified height h, we construct the cylinder C_h(p_1, p_2), whose axis passes through p_1 and p_2
and whose radius is h. We run O(n^2) range fullness queries with these cylinders, and report that
h* > h if at least one of these cylinders is not full, or h* ≤ h otherwise, where h* is the desired
largest height.

¹⁰ We can make these surfaces independent of π if we add π as a seventh degree of freedom, but then the overall
performance of the algorithm deteriorates.
The efficient implementation of this procedure is carried out as above, except that here the dual
representation of these cylinders requires only four degrees of freedom, once h is specified. As in the
preceding case, here too the surfaces of P* also depend on h, so we need a generic parallel procedure
for constructing (1/r_0)-cuttings for these surfaces, which however is not difficult to achieve, since
r_0 is a constant. We omit the simple routine details. Since a (1/r_0)-cutting in R^4 has complexity
O*(r_0^4) [30], a modified version of the preceding analysis then yields:

Theorem 4.5 The largest-height triangle among those spanned by a set of n points in R^3 can be
computed in O*(n^{16/7}) time.

Further extensions. We can extend this machinery to higher dimensions, although its performance
deteriorates as the dimension grows. The range fullness problem in $\mathbb{R}^d$, for $d \ge 4$, can
be handled in much the same way as in the 3-dimensional case. When extending Claim 4.1 we
have an intersection of $m$ convex sets of constant description complexity in $\mathbb{R}^d$, and we can regard
the boundary of the intersection as the lower envelope of $m$ $(d-1)$-variate functions of constant
description complexity, each representing the boundary of one of the input convex sets, in spherical
coordinates about some fixed point in the intersection. The complexity of the lower envelope is
$O^*(m^{d-1})$ [39]. However, we need to decompose the region below the envelope into elementary
cells, and, as already noted, the only known general-purpose technique for doing so is to decompose
the entire arrangement of the graphs of the $m$ boundary functions, and select the cells below the
lower envelope. The complexity of such a decomposition is $O^*(m^{2d-4})$ [17, 30]. This implies that
$\zeta(r) = O^*(r^{2d-4})$. The rest of the analysis, including the construction of a good test set, is done in
essentially the same manner. Hence, using the machinery of the previous section, we obtain:

Theorem 4.6 Let $P$ be a set of $n$ points in $\mathbb{R}^d$, for $d \ge 4$, and let $\Gamma$ be a family of convex ranges
of constant description complexity. Then one can construct, in near-linear time, a data structure
of linear size so that, for any range $\gamma \in \Gamma$, it can determine, in $O^*(n^{1-1/(2d-4)})$ time, whether $\gamma$ is full.
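To get a feel for how the bound degrades with the dimension, the following small sketch tabulates the query exponent. It assumes the exponent has the form $1 - 1/(2d-4)$, which is what one obtains from a decomposition bound of $O^*(r^{2d-4})$; the function name `fullness_query_exponent` is hypothetical.

```python
from fractions import Fraction


def fullness_query_exponent(d):
    """Exponent e such that a fullness query costs O*(n^e) in dimension d >= 4.

    Assumes the decomposition bound zeta(r) = O*(r^(2d-4)), which yields
    the exponent 1 - 1/(2d-4).
    """
    if d < 4:
        raise ValueError("the bound is stated for d >= 4")
    return 1 - Fraction(1, 2 * d - 4)


if __name__ == "__main__":
    for d in range(4, 8):
        print(d, fullness_query_exponent(d))
```

Under this assumption the exponent is 3/4 for d = 4 and 5/6 for d = 5, and it tends to 1 as d grows.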

Finding the largest-area triangle in $\mathbb{R}^d$. Let $P$ be a set of $n$ points in $\mathbb{R}^d$, for $d \ge 4$, and
consider the problem of finding the largest-area triangle spanned by $P$. We apply the same method
as in the 3-dimensional case, whose main component is a decision procedure which tests $O(n^2)$
cylinders for fullness. A cylinder (with a line as an axis) in $\mathbb{R}^d$ has $2d-1$ degrees of freedom, so
the dual representation of our $O(n^2)$ cylinders is as points in $\mathbb{R}^{2d-1}$. The best known bound on
the complexity of a $(1/r)$-cutting in this space is $O^*(r^{2(2d-1)-4}) = O^*(r^{4d-6})$. Applying this bound
and the bound in Theorem 4.6, the overall cost of the decision procedure is



Optimizing the value of r, and applying parametric searching, we get an algorithm for the maximum- 
area triangle in R'^ with running time 



We can extend the other problems (largest-perimeter or largest-height triangles) in a similar manner,
and can also obtain algorithms for solving higher-dimensional variants, such as computing
the largest-volume tetrahedron or higher-dimensional simplices. We omit the straightforward but
tedious analysis, and the resulting cumbersome-looking bounds.

5 Ray shooting amid balls in 3-space



Let $B$ be a set of $n$ balls in 3-space. We show how to preprocess $B$ in near-linear time into a data
structure of linear size, so that, given a query ray $\rho$, the first ball that $\rho$ hits can be computed in
$O^*(n^{2/3})$ time, improving the general bound $O^*(n^{3/4})$ mentioned in the introduction. As already
noted, we use the parametric-searching technique of Agarwal and Matoušek [5], which reduces the
problem to that of efficiently testing whether a query segment $s = qz \subset \rho$ intersects any ball in $B$,
where $q$ is the origin of $\rho$ and $z$ is a parametric point along $\rho$.

Parametric representation of balls and segments. We move to a parametric 4-dimensional
space, in which balls in 3-space are represented by points, so that a ball with center at $(a,b,c)$ and
radius $r$ is mapped to the point $(a,b,c,r)$. A segment $e$, or for that matter, any closed nonempty
set $K \subset \mathbb{R}^3$ of constant description complexity, is mapped to a surface $\sigma_K$, which is the locus of all
points representing balls that touch $K$ but are openly disjoint from $K$. By construction, $\sigma_K$ is the
graph of a totally defined continuous trivariate function $r = \sigma_K(a,b,c)$, which is semi-algebraic of
constant description complexity. Moreover, points below (resp., above) $\sigma_K$ represent balls which
are disjoint from $K$ (resp., intersect $K$).

Moreover, for any such set $K$, $\sigma_K(q)$ is, by definition, the (Euclidean) distance of $q$ from $K$.
Hence, given a collection $\mathcal{K} = \{K_1, K_2, \ldots, K_m\}$ of $m$ sets, the minimization diagram of the surfaces
$\sigma_{K_1}, \ldots, \sigma_{K_m}$ (that is, the projection onto the 3-space $r = 0$ of the lower envelope of these surfaces)
is the nearest-neighbor Voronoi diagram of $\mathcal{K}$. We use this property later on, in deriving a sharp
bound on the resulting function $\zeta(\cdot)$.
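The parametric representation can be made concrete for the case where $K$ is a segment. The following is a minimal sketch, with hypothetical helper names (`dist_point_segment`, `ball_to_point`, `ball_meets_segment`, `nearest_site`): the surface $\sigma_K$ is evaluated as a point-to-segment distance, a ball meets $K$ exactly when its parametric point lies on or above $\sigma_K$, and the cell of the minimization diagram containing a point is found by taking the minimum over the sites.

```python
import math


def dist_point_segment(q, a, b):
    """sigma_K(q): Euclidean distance from q to the segment K = ab in 3-space."""
    ab = [y - x for x, y in zip(a, b)]
    aq = [y - x for x, y in zip(a, q)]
    denom = sum(c * c for c in ab)
    # Parameter of the closest point on ab, clamped to [0, 1].
    t = 0.0 if denom == 0 else max(0.0, min(1.0, sum(x * y for x, y in zip(aq, ab)) / denom))
    closest = [x + t * c for x, c in zip(a, ab)]
    return math.dist(q, closest)


def ball_to_point(center, radius):
    """Map a ball to its point (a, b, c, r) in parametric 4-space."""
    return (*center, radius)


def ball_meets_segment(ball_point, a, b):
    """A ball intersects K iff its point lies on or above the surface sigma_K."""
    *center, r = ball_point
    return r >= dist_point_segment(center, a, b)


def nearest_site(q, segments):
    """Index of the site whose Voronoi cell (minimization-diagram cell) contains q."""
    return min(range(len(segments)), key=lambda i: dist_point_segment(q, *segments[i]))
```

For example, the ball with center (1,1,0) and radius 1 touches the segment from (0,0,0) to (2,0,0), so its parametric point lies on the surface and the test returns true, whereas radius 0.5 puts the point below the surface.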

Building a test set for segment emptiness. Here we use the general recipe for constructing
good test sets, which covers each empty segment $e$ by a fairly complex "canonical" empty region
$K$, which has nonetheless constant description complexity. In parametric 4-space, each such region
$K$ is mapped to the upper halfspace above the corresponding surface $\sigma_K$; this is the set of all balls
that intersect $K$. The complement of the union of $m$ such ranges is the portion of 4-space below the
lower envelope of the corresponding surfaces $\sigma_K$. Using the connection between this envelope and
the Voronoi diagram of the $K$'s, we are able to decompose (the diagram and thus) the complement
of the union into $\zeta(m) = O^*(m^3)$ elementary cells.

In more detail, the construction proceeds as follows. We start by choosing a random sample
$N$ of $O(r \log r)$ balls of $B$, to construct a test set $Q$ for empty segment ranges, with respect to
$N$. While we do not have a clean, explicit geometric definition of these ranges, they will satisfy,
as above, all four requirements of a good test set. We also spell out the adaptation of the
general recipe to the present scenario, to help the reader see through one concrete application of
the general recipe.

Specifically, we move to a dual space, in which segments in 3-space are represented as points.
Segments in 3-space have six degrees of freedom; for example, we can represent a segment by the
coordinates of its two endpoints. The dual space is therefore 6-dimensional. Each ball $B \in N$ is
mapped to a surface $B^*$, which is the locus of all points representing segments which touch $\partial B$ but
do not penetrate into its interior; that is, either they are tangent to $B$, at a point in their relative
interior, or they have an endpoint on $\partial B$ but are openly disjoint from $B$.

Let $N^*$ denote the collection of the surfaces dual to the balls of $N$. Construct a $(1/r)$-cutting
of $\mathcal{A}(N^*)$, which consists of $O^*(r^8)$ elementary cells [17, 30]. Each cell $\tau$ has the property that all
points in $\tau$ represent segments which meet a fixed set of balls from among the balls in $N$ and avoid
all other balls of $N$; the set depends only on $\tau$.

For each cell $\tau$ whose corresponding set of balls is empty, we define $K_\tau$ to be the union, in
3-space, of all segments $e$ whose dual points lie in $\tau$. Since $\tau$ is an elementary cell, $K_\tau$ is a
semi-algebraic set of constant description complexity (see, e.g., [E]). Moreover, $K_\tau$ is an $N$-empty
region, in the sense that it is openly disjoint from all the balls in $N$.

Since we have to work in parametric 4-space, we map each region $K_\tau$ of the above kind into
a range $\gamma_\tau$ in 4-space, which is the locus of all (points representing) balls which intersect $K_\tau$. As
discussed above, $\gamma_\tau$ is the upper halfspace bounded by the graph of the distance function from
points in $\mathbb{R}^3$ to $K_\tau$.

We define the desired test set $Q$ to consist of all the ranges $\gamma_\tau$, as just defined, and argue that
$Q$ indeed satisfies all four properties required from a good test set: (a) Compactness: $|Q| = O^*(r^8)$,
so its size is small. (b) Shallowness: With high probability, each range in $Q$ is $(n/r)$-shallow, since
it does not contain any point representing a ball in $N$ (and we assume that the sample $N$ does
indeed have this property, which makes all the ranges in $Q$ $(n/r)$-shallow). (c) Containment: Each
empty segment $e$ is also $N$-empty, so its dual point lies in some cell $\tau$ of the cutting, whose associated
subset of balls is empty. By construction, we have $e \subseteq K_\tau$. That is, any ball intersecting $e$ also
intersects $K_\tau$, so the range in 4-space that $e$ defines is contained in $\gamma_\tau$, i.e., in a single range of $Q$.
(d) Efficiency: The complement of the union of any $m$ ranges in $Q$ can be decomposed into $O^*(m^3)$
elementary cells.

The proof of (d) proceeds as follows. The complement of the union of $m$ ranges, $\gamma_{\tau_1}, \ldots, \gamma_{\tau_m}$, is
the region below the lower envelope of the corresponding surfaces $\sigma_{K_{\tau_1}}, \ldots, \sigma_{K_{\tau_m}}$. To decompose
this region, it suffices to produce a decomposition of the 3-dimensional minimization diagram of
these surfaces, and extend each of the resulting cells into a semi-unbounded vertical prism, whose
"ceiling" lies on the envelope.

The combinatorial complexity of the minimization diagram of a collection $\mathcal{K} = \{K_{\tau_1}, \ldots, K_{\tau_m}\}$
of $m$ trivariate functions of constant description complexity is $O^*(m^3)$ [10]. Moreover, as noted
above, the minimization diagram is the Euclidean nearest-neighbor Voronoi diagram of $\mathcal{K}$.

We can decompose each cell $V_i = V(K_{\tau_i})$ of the diagram (or, more precisely, the portion
of the cell outside the union of the $K_{\tau_i}$'s) using its star-shapedness with respect to its "site"
$K_{\tau_i}$; that is, for any point $p \in V(K_{\tau_i})$, the segment connecting $p$ to its nearest point on $K_{\tau_i}$ is
fully contained in $V(K_{\tau_i})$. As is easy to verify, this property holds regardless of the shape, or
intersection pattern, of the regions in $\mathcal{K}$. We first decompose the 2-dimensional faces bounding $V_i$
into elementary cells, using, e.g., an appropriate variant of 2-dimensional vertical decomposition,
and then take each such cell $\phi_0$ and extend it to a cell $\phi$, which is the union of all segments, each
connecting a point in $\phi_0$ to its nearest point on $K_{\tau_i}$. The resulting cells, obtained by applying
this decomposition to all cells of the diagram, form a decomposition of the portion of the diagram
outside the union of the $K_{\tau_i}$'s, into a total of $O^*(m^3)$ elementary cells, as desired. The union of
the $K_{\tau_i}$'s themselves, being a subcollection of cells of a 3-dimensional arrangement of $m$ regions of
constant description complexity, can also be decomposed into $O^*(m^3)$ cells, using standard results
on vertical decomposition in three dimensions [40].

Using Lemma 3.4 and the machinery of Section 3, in conjunction with the parametric searching
technique of [5], we thus obtain the following theorem.

Footnote: This bound holds regardless of how "badly" the regions in $\mathcal{K}$ are shaped, or how "wildly" they can intersect
one another, as long as each of them has constant description complexity.

Theorem 5.1 Ray shooting amid $n$ balls in 3-space can be performed in $O^*(n^{2/3})$ time, using a
data structure of $O(n)$ size, which can be constructed in $O^*(n)$ time.

Remark: In the preceding description, we only considered empty ranges. If desired, we can 
extend the analysis to obtain a data structure which also supports "reporting queries", in which 
we want to report the first k balls hit by a query ray. We omit the details of this straightforward 
extension. 

6 Range emptiness searching and reporting in the plane 

Fat triangle reporting and emptiness searching. Let $\alpha > 0$ be a fixed constant, and let $P$
be a set of $n$ points in the plane. We wish to preprocess $P$, in $O^*(n)$ time, into a data structure
of size $O(n)$, which, given an $\alpha$-fat query triangle $\gamma$ (which, as we recall, is a triangle all of whose
angles are at least $\alpha$), can determine in $O^*(1)$ time whether $\gamma \cap P = \emptyset$, or report in $O^*(1) + O(k)$
time the points of $P$ in $\gamma$, where $k = |P \cap \gamma|$.

To do so, we need to construct a good test set $Q$. We use the following "canonization" process
(an ad-hoc process, not following the general recipe of Section 3). As above, we apply the
construction to a random sample $N$ of $O(r \log r)$ points of $P$. For simplicity, we first show how to canonize
empty triangles, and then extend the construction to shallow triangles. (As before, the first part
suffices for emptiness searching, whereas the second part is needed for reporting queries.) Let $\Delta$ be
an $\alpha$-fat empty triangle, which is then also $N$-empty. We expand $\Delta$ homothetically, keeping one
vertex fixed and translating the opposite side away, until it hits a point $q_1$ of $N$. We then expand
the new triangle homothetically from a second vertex, until the opposite side hits a second point
$q_2$ of $N$, and then apply a similar expansion from the third vertex, making the third edge of the
triangle touch a third point $q_3$ of $N$. We end up with an $N$-empty triangle $\Delta'$, homothetic to, and
containing, $\Delta$, each of whose sides passes through one of the points $q_1, q_2, q_3 \in N$. See Figure 2.
(It is possible that some of these expansions never hit a point of $N$, so we may end up with an
unbounded wedge or halfplane instead of a triangle. Also, the points $q_1, q_2, q_3$ need not be distinct.)







Figure 2: The first step in canonizing an empty triangle. 
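One homothetic expansion step of the canonization can be sketched as follows. This is an illustrative brute-force version under stated assumptions: the triangle is given by its vertices `a`, `b`, `c` and is non-degenerate, the expansion is centered at `a`, and the triangle is assumed to be empty of sample points, so every relevant scaling factor is at least 1. The function names are hypothetical.

```python
def expansion_factor(a, b, c, q):
    """Scaling factor (about a) at which the side opposite a passes through q.

    Writes q = a + lb*(b - a) + lc*(c - a); q is swept by the moving side only
    if it lies in the wedge at a (lb, lc >= 0), and then the factor is lb + lc.
    Returns None if q lies outside the wedge.
    """
    ux, uy = b[0] - a[0], b[1] - a[1]
    vx, vy = c[0] - a[0], c[1] - a[1]
    wx, wy = q[0] - a[0], q[1] - a[1]
    det = ux * vy - uy * vx          # nonzero for a non-degenerate triangle
    lb = (wx * vy - wy * vx) / det
    lc = (ux * wy - uy * wx) / det
    if lb < 0 or lc < 0:
        return None
    return lb + lc


def expand_from_vertex(a, b, c, sample):
    """Expand triangle abc homothetically from a until the opposite side hits a sample point.

    Returns (t, (a, b', c')) for the first hit, or None if no sample point is
    ever hit (the expansion degenerates to an unbounded wedge).
    """
    best = None
    for q in sample:
        t = expansion_factor(a, b, c, q)
        if t is not None and t >= 1 and (best is None or t < best):
            best = t
    if best is None:
        return None
    b2 = (a[0] + best * (b[0] - a[0]), a[1] + best * (b[1] - a[1]))
    c2 = (a[0] + best * (c[0] - a[0]), a[1] + best * (c[1] - a[1]))
    return best, (a, b2, c2)
```

Applying the same routine in turn from the second and third vertices of the expanded triangle mimics the three expansions described above.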

Let $\mathcal{D}$ be the set of orientations $\{j\alpha/4 \mid j = 0, 1, \ldots, \lfloor 8\pi/\alpha \rfloor\}$. We turn the side containing $q_1$
clockwise and counterclockwise about $q_1$, keeping its endpoints on the lines containing the other
two sides, until we reach an orientation in $\mathcal{D}$, or until we hit another point of $N$ (which could also
be one of the points $q_2, q_3$), whichever comes first. Each of the new sides forms, with the two lines
containing the two other sides, a new (openly) $N$-empty $(3\alpha/4)$-fat triangle; the union of these two
triangles covers $\Delta'$. See Figure 3.




Figure 3: The second step in canonizing an empty triangle. 
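The set of canonical orientations is easy to generate explicitly. The sketch below assumes the set has the form $\{j\alpha/4 \mid j = 0, \ldots, \lfloor 8\pi/\alpha \rfloor\}$ and that orientations are measured in radians; the function names are hypothetical. The second helper returns the two canonical orientations that bracket a given orientation, which are the places where a rotating side would stop if it meets no sample point first.

```python
import math


def canonical_orientations(alpha):
    """All orientations j*alpha/4 for j = 0, ..., floor(8*pi/alpha)."""
    return [j * alpha / 4 for j in range(math.floor(8 * math.pi / alpha) + 1)]


def bracketing_orientations(theta, alpha):
    """Canonical orientations just below (or equal to) and just above theta.

    A side rotated in either direction from orientation theta turns by at most
    alpha/4 before reaching one of these two values.
    """
    step = alpha / 4
    j = math.floor(theta / step)
    return j * step, (j + 1) * step
```

Since each rotation changes the orientation of a side by at most $\alpha/4$, the number of canonical orientations is $O(1/\alpha)$, a constant for fixed $\alpha$.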

For each of the two new triangles, $\Delta''$, we apply the same construction, by rotating the side
containing $q_2$ clockwise and counterclockwise, thereby obtaining two new triangles whose union
covers $\Delta''$. We then apply the same construction to each of the four new triangles, this time
rotating about $q_3$. Overall, we get up to eight new triangles whose union covers $\Delta$. Each of these
new triangles is $(\alpha/2)$-fat, openly $N$-empty, and each of its sides either passes through two points
of $N$, or passes through one point of $N$ and has orientation in $\mathcal{D}$. Since $|\mathcal{D}| = O(1/\alpha) = O(1)$, it
follows that the overall number of these canonical covering triangles is $O((r \log r)^6) = O^*(r^6)$. (We
omit the easy extensions of this step to handle unbounded wedges or halfplanes, or the cases where
the points $q_i$, or some of the newly encountered points, lie at vertices of the respective triangles.)

We take $Q$ to be the collection of these canonical triangles, and argue that $Q$ indeed satisfies the
properties of a good test set: (a) Compactness: $|Q| = O^*(r^6)$, so its size is small. (b) Shallowness:
With high probability, each range in $Q$ is $(n/r)$-shallow (and, as usual, we assume that this property
does indeed hold). (c) Containment: By construction, each $\alpha$-fat empty triangle is contained in the
union of at most eight triangles in $Q$. (d) Efficiency: Being $(\alpha/2)$-fat, the union of any $m$ triangles in
$Q$ has complexity $O(m \log\log m)$ [35], so the associated function $\zeta$ satisfies $\zeta(m) = O(m \log\log m)$.
This, combined with Lemma 3.4 and the machinery of Section 3, leads to the following theorem.

Theorem 6.1 One can preprocess a set $P$ of $n$ points in the plane, in near-linear time, into a data
structure of linear size, so that, for any query $\alpha$-fat triangle $\Delta$, one can determine, in $O^*(1)$ time,
whether $\Delta \cap P = \emptyset$.

Reporting points in fat triangles. We can extend the technique given above to the problem
of reporting the points of $P$ that lie inside any query fat triangle. For this, we need to construct a
test set that will be good for shallow ranges and not just for empty ones. Using Theorem 2.2, we
construct (by random sampling) a shallow $(1/r)$-net $N \subseteq P$ of size $O(r \log r)$. We next canonize
every $(n/r)$-shallow $\alpha$-fat triangle $\Delta$, by the same canonization process used above, with respect
to the set $N$. Note that each of the resulting canonical triangles contains (in its interior) the same
subset of $N$ as $\Delta$ does. By the properties of shallow $(1/r)$-nets, since $|\Delta \cap P| \le n/r$, we have
$|\Delta \cap N| = O(\log r)$, so all the resulting canonical triangles are $(c \log r)$-shallow with respect to $N$,
for some absolute constant $c$. Again, since $N$ is a shallow $(1/r)$-net, all the canonical triangles are
$(c'n/r)$-shallow with respect to $P$, for another absolute constant $c'$. Hence, the resulting collection $Q$
of canonical triangles is a good test set for all shallow fat triangles, and we can apply the machinery
of Section 3 to obtain a data structure of linear size, which can be constructed in near-linear time,
and which can perform reporting queries in fat triangles in time $O^*(1) + O(k)$, where $k$ is the output
size of the query.

Range emptiness searching with semidisks and circular caps. The motivation for studying
this problem comes from the following problem, addressed in [20]. We are given a set $P$ of $n$ points
in the plane, and wish to preprocess it into a data structure of linear size, so that, given a query
point $q$ and a query line $\ell$, one can quickly find the point of $P$ closest to $q$ and lying above $\ell$. In
the original problem, as formulated in [20], one also assumes that $q$ lies on $\ell$, but we will consider,
and solve, the more general version of the problem, where $q$ lies on or above $\ell$.

The standard approach (e.g., as in [6]) yields a solution with linear storage and near-linear
preprocessing, and query time $O^*(n^{1/2})$. We present a solution with query time $O^*(1)$.

Using parametric searching [36], the problem reduces to that of testing whether the intersection
of a disk of radius $\rho$ centered at $q$ with the halfplane $\ell^+$ above $\ell$ is $P$-empty. The resulting range
is a circular cap larger than a semidisk (or exactly a semidisk if $q$ lies on $\ell$). Again, the main task
is to construct a good test set $Q$ for such ranges, which we do by using an ad-hoc canonization
process, which covers each empty circular cap by $O(1)$ canonical caps, which satisfy the properties
of a good test set; in particular, we will have $\zeta(m) = O^*(m)$. (As before, we consider here only the
case of emptiness detection, and will consider the reporting problem later.)
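The reduction from the nearest-neighbor query to cap emptiness can be illustrated with a simplified sketch. It is not the parametric searching of [36]: here the emptiness test is a brute-force scan, and the search is an ordinary binary search over the sorted candidate distances, which works because emptiness of the cap is monotone in the radius. The line is represented (as an assumption of this sketch) by a point and a direction vector, with "above" meaning on or to the left of the directed line; all function names are hypothetical.

```python
import math


def above(line, p):
    """True if p lies on or to the left of the directed line (point a, direction d)."""
    (ax, ay), (dx, dy) = line
    return dx * (p[1] - ay) - dy * (p[0] - ax) >= 0


def cap_is_empty(points, q, rho, line):
    """Brute-force emptiness test for the cap: disk(q, rho) intersected with the halfplane."""
    return not any(above(line, p) and math.dist(p, q) <= rho for p in points)


def nearest_above(points, q, line):
    """Point of `points` above the line that is nearest to q, or None if there is none."""
    radii = sorted({math.dist(p, q) for p in points})
    lo, hi = 0, len(radii)
    while lo < hi:                      # find the smallest radius with a non-empty cap
        mid = (lo + hi) // 2
        if cap_is_empty(points, q, radii[mid], line):
            lo = mid + 1
        else:
            hi = mid
    if lo == len(radii):
        return None
    rho = radii[lo]
    return next(p for p in points if above(line, p) and math.dist(p, q) == rho)
```

With the x-axis as the line, directed to the right, and the query point (0, 0.5), the point (0, -0.1) is the closest overall but lies below the line, so the sketch returns the nearest point above it.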

To construct a test set $Q$ we choose a random sample $N$ of $O(r \log r)$ points of $P$ and build a
set of canonical empty ranges with respect to $N$. Let $C = C_{c,\rho,\ell}$ be a given circular cap (larger
than a semidisk) with center $c$, radius $\rho$, and chord supported by a line $\ell$. We first translate $\ell$ in
the direction which enlarges the cap, until either its portion within the disk $D$ of the cap touches
a point of $N$, or $\ell$ leaves $D$. See Figure 4 (left). In the latter case, $C$ is contained in a complete
$N$-empty disk, and it is fairly easy to show that such a disk is contained in the union of at most
three canonical $N$-empty disks, each passing through three points of $N$ or through two diametrically
opposite points of $N$; there are at most $O^*(r^3)$ such disks.




Figure 4: The first steps in canonizing an empty cap. 

Suppose then that the new chord (we continue to denote its line as $\ell$) passes through a point $q$ of
$N$, as in the figure. Let $\mathcal{D}$ be a set of $O(1)$ canonical orientations, uniformly spaced and sufficiently
dense along the unit circle, for some small constant value $\alpha$. Rotate $\ell$ about $q$ in both clockwise
and counterclockwise directions, until we reach one of the two following events: (i) the orientation
of $\ell$ belongs to $\mathcal{D}$; or (ii) the portion of $\ell$ within $D$ touches another point of $N$. In either case, the
two new lines, call them $\ell_1, \ell_2$, become canonical; there are only $O^*(r^2)$ such possible lines. Note
that our original cap $C$ is contained in the union $C_1 \cup C_2$, where $C_1 = C_{c,\rho,\ell_1}$ and $C_2 = C_{c,\rho,\ell_2}$.
Moreover, although the new caps need no longer be larger than a semidisk, they are not much
smaller; this is an easy exercise in elementary geometry. See Figure 4 (right).

We next canonize the disk of $C$ (which is also the disk of $C_1$ and $C_2$). Fix one of the new caps,
say $C_1$. Expand $C_1$ from the center $c$, keeping the line $\ell_1$ fixed, until we hit a point $q_1$ of $N$ (lying
in $\ell_1^+$). See Figure 5 (left). If $c$ lies in $\ell_1^+$ then we move $c$ parallel to $\ell_1$ in both directions, again
keeping $\ell_1$ itself fixed and keeping the circle passing through $q_1$, until we obtain two circular caps,
each passing through $q_1$ and through a second point of $N$ (if we do not hit a second point, we reach
a quadrant, bounded by $\ell_1$ and by the line orthogonal to $\ell_1$ through $q_1$). The union of the two new
circular caps covers $C_1$. See Figure 5 (right).



If $c$ lies in $\ell_1^-$, we move it along the two rays connecting it to the endpoints $u_0, v_0$ of the chord
defined by $\ell_1$. As before, each of the motions stops when the circle hits another point of $N$ in $\ell_1^+$, or
when the motion reaches $u_0$ or $v_0$. We claim that $C_1$ is contained in the union of the two resulting
caps. Indeed, let $u$ and $v$ denote the locations of the center at the two stopping placements. We
need to show that, for any point $b \in C_1$, we have either $|bu| \le |q_1u|$ or $|bv| \le |q_1v|$. If both
inequalities did not hold, then both $u$ and $v$ would have to lie on the side of the perpendicular
bisector of $q_1b$ containing $q_1$. This is easily seen to imply that $c$ must also lie on that side, which
however is impossible (because $|bc| \le |q_1c|$). See Figure 6.

Next, we take one of these latter caps, $C'$, whose bounding circle passes through $q_1$ and through
a second point $q_2$ of $N \cap \ell_1^+$, and move its center along the bisector of $q_1q_2$ in both directions,
keeping the bounding circle touching $q_1$ and $q_2$, and still keeping the line $\ell_1$ supporting the chord
fixed. We stop when the first of these events takes place: (i) The center reaches $\ell_1$, in which case
the cap becomes a semidisk (this can happen in only one of the moving directions). (ii) The center
reaches the midpoint of $q_1q_2$. (iii) The bounding circle touches a third point of $N \cap \ell_1^+$. (iv) The
central angle of the chord along $\ell_1$ is equal to some fixed positive angle $\beta > 0$. The union of the
two new caps covers $C'$. (It is possible that during the motion the moving circle becomes tangent
to $\ell_1$, and then leaves it, in which case the corresponding final cap is a full disk.)

Figure 5: The second step in canonizing an empty cap.

Figure 6: $C_1$ is contained in the union of the two other caps.

Similarly, if the center of $C'$ lies on $\ell_1$ (which can happen when the motion in the preceding
canonization step reaches $v_0$ or $u_0$), then we canonize its disk by translating the center to the left
and to the right along $\ell_1$ until the bounding circle touches another point of $N \cap \ell_1^+$, exactly as in
the preceding case (shown in Figure 5 (right)).

Let $C''$ be one of the four new caps. In all cases $C''$ is canonical. For the first kind of caps, if the
stopping condition that defines $C''$ is (ii) or (iii), then either the circle bounding $C''$ passes through
three points of $N$ or it passes through two diametrically opposite points of $N$. There are a total
of $O^*(r^3)$ such circles, and since $C''$ is obtained (in a unique manner) by the interaction of one of
these circles and one of the $O^*(r^2)$ canonical chord-lines, there is a total of $O^*(r^5)$ such caps. If the
stopping condition is (i), the cap is also canonical, because the center of the containing disk is the
intersection point of a bisector of two points of $N$ with one of the $O^*(r^2)$ canonical chord-lines, so
there is a total of 0*(r^) such caps. In the case of condition (iv), there are only 0*(r^) such disks, 
for a total of 0*{r'^) caps. Similar reasoning shows that the caps resulting in the second case are 
also canonical, and their number is 0*(r^). 

We take the test set $Q$ to consist of all the caps of the final forms, and argue that it satisfies the
properties of a good test set: (a) Compactness: $|Q| = O^*(r^5)$, so its size is small. (b) Shallowness:
With high probability, each range in $Q$ is $(n/r)$-shallow (and we assume that this property does
indeed hold). (c) Containment: Each empty cap is also $N$-empty, so, by the above canonization
process, it is contained in the union of $O(1)$ caps of $Q$. (d) Efficiency: Each cap $C \in Q$ is
$(\alpha,\beta)$-covered, for appropriate fixed constants $\alpha, \beta > 0$, in the terminology of [23], meaning that for each
point $p \in \partial C$ there exists an $\alpha$-fat triangle touching $p$, contained in $C$, and with diameter which
is at least $\beta$ times the diameter of $C$. In addition, the boundaries of any two ranges in $Q$ (or of
any two circular caps, for that matter) intersect in at most four points, as is easily checked. As
follows from the recent analysis of de Berg [22], the complexity of the union of any $m$ ranges in $Q$
is $O(\lambda_6(m) \log^2 m) = O^*(m)$. Hence, the complement of the union of any $m$ ranges in $Q$ can be
decomposed into $O^*(m)$ elementary cells, making $\zeta(m) = O^*(m)$.

In conclusion, we obtain: 

Theorem 6.2 Let $P$ be a set of $n$ points in the plane. We can preprocess $P$, in near-linear time,
into a data structure of linear size, so that, for any query circular cap $C$, larger than a semidisk,
we can test whether $C \cap P$ is empty, in $O^*(1)$ time.

Combining Theorem 6.2 with parametric searching, we obtain:

Corollary 6.3 Let $P$ be a set of $n$ points in the plane. We can preprocess $P$, in near-linear time,
into a data structure of linear size, so that, for any query halfplane $\ell^+$ and point $q \in \ell^+$, we can
find, in $O^*(1)$ time, the point in $P \cap \ell^+$ nearest to $q$.

Remark. The machinery developed in this section also applies to smaller circular caps, as long as
they are not too small. Formally, if the central angle of each cap is at least some fixed constant
$\alpha > 0$, the same technique holds, so we can test emptiness of such ranges in $O^*(1)$ time, using a
data structure which requires $O(n)$ storage and $O^*(n)$ preprocessing. Thus Theorem 6.2 carries
over to this scenario, but Corollary 6.3 does not, because we have no control over the "fatness"
of the cap, as the disk shrinks or expands, when the center of the disk lies in $\ell^-$, and once the
canonical caps become too thin, the complexity of their union may become quadratic.

Reporting points in semidisks and circular caps. As in the case of fat triangles, we can
extend the technique to answer efficiently range reporting queries in semidisks or in sufficiently
large circular caps. We use the same canonization process, with respect to a random sample $N$
of size $O(r \log r)$ which is a shallow $(1/r)$-net, and argue, exactly as in the case of fat triangles,
that the resulting collection of canonical caps is a good test set for shallow semidisk or larger cap
ranges. Applying the machinery of Section 3, we obtain a data structure of linear size, which can
be constructed in near-linear time, and which can perform reporting queries in semidisks or larger
caps, in time $O^*(1) + O(k)$, where $k$ is the output size of the query.

7 Approximate range counting 

Given a set $P$ of $n$ points in $\mathbb{R}^d$, a set $\Gamma$ of semi-algebraic ranges of constant description complexity,
and a parameter $\delta > 0$, the approximate range counting problem is to preprocess $P$ into a data
structure such that, for any query range $\gamma \in \Gamma$, we can efficiently compute an approximate count
$t_\gamma$ which satisfies

$$(1-\delta)\,|P \cap \gamma| \le t_\gamma \le (1+\delta)\,|P \cap \gamma|.$$

As in most of the rest of the paper, we will only consider the case where the size of the data
structure is to be (almost) linear, and the goal is to find solutions with small query time.

The problem has been studied in several recent papers [12, 13, 14, 29], for the special case where
$P$ is a set of points in $\mathbb{R}^d$ and $\Gamma$ is the collection of halfspaces (bounded by hyperplanes). A variety
of solutions, with near-linear storage, were derived; in all of them, the dependence of the query cost
on $n$ is close to $n^{1-1/\lfloor d/2 \rfloor}$, which, as reviewed earlier, is roughly the same as the cost of halfspace
range emptiness queries, or the overhead cost of halfspace range reporting queries [33].

The fact that the approximate range counting problem is closely related to range emptiness
comes as no surprise, because, when $P \cap \gamma = \emptyset$, the approximate count $t_\gamma$ must be 0, so range
emptiness is a special case of approximate range counting. The goal is therefore to derive solutions
that are comparable, in their dependence on $n$, with those that solve emptiness (or reporting)
queries. As just noted, this has been accomplished for the case of halfspaces. In this section we
extend this technique to the general semi-algebraic case.

The simplest solution is to adapt the technique of Aronov and Har-Peled [12], which uses a
procedure for answering range emptiness queries as a "black box". Specifically, suppose we have
a data structure, $\mathcal{D}(P')$, for any set $P'$ of $n'$ points, which can be constructed in $T(n')$ time, uses
$S(n')$ storage, and can determine whether a query range $\gamma \in \Gamma$ is empty in $Q(n')$ time. Using
such a black box, Aronov and Har-Peled show how to construct a data structure for $n$ points using
$O\big((\delta^{\lambda-3} + \sum_{i=1}^{\lceil 1/\delta \rceil} 1/i^{\lambda-2})\, S(n) \log n\big)$ storage and $O\big((\delta^{\lambda-3} + \sum_{i=1}^{\lceil 1/\delta \rceil} 1/i^{\lambda-2})\, T(n) \log n\big)$ preprocessing,
where $\lambda \ge 1$ is some constant for which $S(n/r) = O(S(n)/r^\lambda)$ and $T(n/r) = O(T(n)/r^\lambda)$, for
any $r > 1$. Given a range $\gamma \in \Gamma$, the data structure of [12] returns, in $O(\delta^{-2} Q(n) \log n)$ time, an
approximate count $t_\gamma$, satisfying $(1-\delta)\,|\gamma \cap P| \le t_\gamma \le |\gamma \cap P|$.

The intuition behind this approach is that a range $\gamma$, containing $m$ points of $P$, is expected to
contain $mr/n$ points in a random sample from $P$ of size $r$, and no points in a sample of size smaller
than $n/m$. The algorithm of [12] then guesses the value of $m$ (up to a factor of $1+\delta$), sets $r$ to be
an appropriate multiple of $n/m$, and draws many (specifically, $O(\delta^{-2} \log n)$) random samples of size
$r$. If $\gamma$ is empty (resp., nonempty) for many of the samples then, with high probability, the guess
for $m$ is too large (resp., too small). When we cannot decide either way, we are at the correct value
of $m$ (up to a relative error of $\delta$). The actual details of the search are somewhat more contrived;
see [12] for those details.
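The sampling intuition can be checked with a small simulation. This is only a sketch of the intuition, not the algorithm of [12]: it uses a brute-force emptiness test on a list of membership flags, and the function name `empty_fraction` and its parameters are hypothetical.

```python
import random


def empty_fraction(in_range, r, trials, rng):
    """Fraction of random size-r samples (without replacement) that miss the range.

    `in_range[i]` is True when the i-th point of P lies in the query range;
    `rng` is a random.Random instance, passed in so that runs are reproducible.
    """
    n = len(in_range)
    empty = 0
    for _ in range(trials):
        sample = rng.sample(range(n), r)
        if not any(in_range[i] for i in sample):
            empty += 1
    return empty / trials


if __name__ == "__main__":
    rng = random.Random(0)
    flags = [True] * 10 + [False] * 990        # n = 1000 points, m = 10 in the range
    for r in (1, 10, 100, 1000):
        print(r, empty_fraction(flags, r, 2000, rng))
```

With $n = 1000$ and $m = 10$, samples much smaller than $n/m = 100$ are empty most of the time, samples much larger are almost never empty, and the transition happens around $r \approx n/m$, which is what allows the emptiness oracle to locate $m$.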

Plugging our emptiness data structures into the machinery of [12], we therefore obtain the
following results. In all these applications we can take $\lambda = 1$, so, in the terminology used above,
the overall data structure uses $O(\delta^{-2} S(n) \log n)$ storage and $O(\delta^{-2} T(n) \log n)$ preprocessing.

Corollary 7.1 Let $P$ be a set of $n$ points in the plane, and let $\alpha, \delta$ be given positive parameters.
Then we can preprocess $P$ into a data structure of size $O(\delta^{-2} n \log n)$, in time $O(\delta^{-2} n^{1+\varepsilon})$, for any
$\varepsilon > 0$, such that, for any $\alpha$-fat query triangle $\Delta$, we can compute, in $O(\delta^{-2} n^{\varepsilon})$ time, for any $\varepsilon > 0$,
an approximate count $t_\Delta$ satisfying $(1-\delta)\,|\Delta \cap P| \le t_\Delta \le |\Delta \cap P|$.

Corollary 7.2 Let $P$ be a set of $n$ points in the plane, and let $\delta$ be a given positive parameter.
Then we can preprocess $P$ into a data structure of size $O(\delta^{-2} n \log n)$, in time $O(\delta^{-2} n^{1+\varepsilon})$, for
any $\varepsilon > 0$, such that, for any line $\ell$, point $p$ on $\ell$ or above $\ell$, and distance $d$, we can compute, in
$O(\delta^{-2} n^{\varepsilon})$ time, for any $\varepsilon > 0$, an approximate count $t_{\ell,p,d}$ of the exact number $N_{\ell,p,d}$ of the points
of $P$ which lie above $\ell$ and at distance at most $d$ from $p$, so that $(1-\delta)\,N_{\ell,p,d} \le t_{\ell,p,d} \le N_{\ell,p,d}$.

Corollary 7.3 Let $P$ be a set of $n$ points in $\mathbb{R}^3$, $\Gamma$ a collection of convex semi-algebraic ranges of
constant description complexity, and $\delta$ a given positive parameter. Then we can preprocess $P$ into
a data structure of size $O(\delta^{-2} n \log n)$, in time $O(\delta^{-2} n^{1+\varepsilon})$, for any $\varepsilon > 0$, such that, for any query
range $\gamma \in \Gamma$, we can compute, in $O(\delta^{-2} n^{1/2+\varepsilon} \log n)$ time, for any $\varepsilon > 0$, an approximate count $t_\gamma$
of the number of points of $P$ outside $\gamma$, satisfying $(1-\delta)\,|\gamma^c \cap P| \le t_\gamma \le |\gamma^c \cap P|$.

Corollary 7.4 Let $B$ be a set of $n$ balls in $\mathbb{R}^3$, and let $\delta$ be a given positive parameter. Then 
we can preprocess $B$ into a data structure of size $O(\delta^{-2} n\log n)$, in time $O(\delta^{-2} n^{1+\varepsilon})$, for any 
$\varepsilon > 0$, such that, for any query ray $\rho$, we can compute, in $O(\delta^{-2} n^{2/3+\varepsilon}\log n)$ time, for any $\varepsilon > 0$, 
an approximate count $t_\rho$ of the exact number $N_\rho$ of the balls of $B$ intersected by $\rho$, satisfying 
$(1-\delta)N_\rho \le t_\rho \le N_\rho$. 

Remark: Another approach to approximate range counting has been presented in [13, 14], in 
which, rather than using range emptiness searching as a black box, one modifies the partition 
tree of the range emptiness data structure, and augments each of its inner nodes with a so-called 
relative $(p,\varepsilon)$-approximation set; these sets are then used to obtain the approximate count of a range. 
This approach too can be adapted to yield efficient approximate range counting algorithms for 
semi-algebraic ranges, with a slightly improved dependence of their performance on $\delta$. We omit 
the details of such an adaptation in this paper. 

8 Conclusion 

In this paper we have presented a general approach to efficient range emptiness searching with 
semi-algebraic ranges, and have applied it to several specific emptiness searching and ray shooting 
problems. The present study resolves and overcomes the technical problems encountered in our 
earlier study [l2], and presents more applications of the technique. 

Clearly, there are many other applications of the new machinery, and an obvious direction for 
further research is to "dig them up". In each such problem, the main step would be to design a 
good test set, with associated function $\zeta(\cdot)$ as small as possible, using either the general recipe or an 
appropriate ad-hoc analysis. Many specific instances of this step are likely to generate interesting 
(and often hard) combinatorial questions. For example, as already mentioned earlier, we still do 
not know whether the complement of the union of $n$ (congruent) cylinders in $\mathbb{R}^3$ can be decomposed 
into $O^*(n^2)$ elementary cells. 